Data Scientist/Engineer
Description

Duties and Responsibilities:

  • Data Scientist:
    • Data Analysis: Perform exploratory data analysis to identify trends, patterns, and relationships within the data. Use statistical methods and machine learning algorithms to derive insights and predictions.
    • Model Development: Develop and implement predictive models and machine learning algorithms to solve business problems. This includes feature engineering, model selection, and evaluation.
    • Data Cleaning and Preprocessing: Cleanse, preprocess, and transform raw data into a usable format. Handle missing values, outliers, and inconsistencies to ensure data quality (a brief sketch follows this list).
    • Data Visualization: Create visualizations and dashboards to communicate findings and insights effectively. Use tools like Matplotlib, Seaborn, Tableau, or Power BI to present complex data in a clear and intuitive manner.
    • Collaboration: Work closely with cross-functional teams, including business stakeholders, software engineers, and domain experts, to understand requirements and deliver solutions that meet business objectives.
    • Research and Innovation: Stay updated on the latest trends and advancements in data science and machine learning. Explore new techniques and methodologies to improve model performance and drive innovation.
    • Documentation: Document methodologies, algorithms, and findings in a clear and concise manner. Maintain thorough documentation to ensure reproducibility and scalability of analyses.
  • Data Engineer:
    • Data Pipeline Development: Design, implement, and maintain scalable and reliable data pipelines to ingest, process, and transform large volumes of data from diverse sources. This includes batch and real-time data processing.
    • Data Modeling: Design and implement data models and schemas to support analytics, reporting, and machine learning applications. Optimize data structures for efficient querying and analysis.
    • Data Integration: Integrate data from multiple sources, including databases, APIs, streaming platforms, and third-party services. Ensure data consistency, integrity, and accuracy across different systems.
    • Data Warehousing: Design and maintain data warehouse solutions to store and organize structured and unstructured data for analytical purposes. Implement data partitioning, indexing, and optimization techniques for performance and scalability.
    • ETL (Extract, Transform, Load): Develop and maintain ETL processes to extract data from source systems, transform it according to business requirements, and load it into target systems. Monitor and troubleshoot ETL jobs to ensure data completeness and quality (a brief sketch follows this list).
    • Data Quality and Governance: Implement data quality checks and validation rules to ensure the integrity and reliability of data. Establish data governance practices and policies to maintain data security, privacy, and compliance with regulatory requirements.
    • Performance Tuning: Identify and resolve performance bottlenecks in data processing and storage systems. Optimize query performance, data partitioning strategies, and resource utilization to improve overall system efficiency.
    • Infrastructure Management: Manage data infrastructure components such as databases, data lakes, data warehouses, and cloud services. Monitor system health, capacity, and availability, and implement disaster recovery and backup solutions.
    • Collaboration: Work closely with cross-functional teams, including data scientists, analysts, software engineers, and business stakeholders, to understand data requirements and deliver solutions that meet business objectives.
  • Performs other duties as required
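
As a rough, hypothetical illustration of the Data Scientist's cleaning and modeling duties above, here is a minimal Python sketch using pandas and scikit-learn. The file name and columns (age, income, churned) are illustrative placeholders, not project specifics:

  # Minimal clean -> model -> evaluate cycle; all names are illustrative only.
  import pandas as pd
  from sklearn.ensemble import RandomForestClassifier
  from sklearn.metrics import accuracy_score
  from sklearn.model_selection import train_test_split

  df = pd.read_csv("operational_data.csv")  # hypothetical source file

  # Data cleaning: drop duplicates, impute missing values, clip outliers.
  df = df.drop_duplicates()
  df["income"] = df["income"].fillna(df["income"].median())
  df["income"] = df["income"].clip(df["income"].quantile(0.01),
                                   df["income"].quantile(0.99))

  # Model development: simple feature set, train/test split, evaluation.
  X, y = df[["age", "income"]], df["churned"]
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, test_size=0.2, random_state=42)
  model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
  print("holdout accuracy:", accuracy_score(y_test, model.predict(X_test)))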
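
Likewise, a bare-bones sketch of the Data Engineer's batch ETL duty (extract, transform, load), using pandas with the standard-library sqlite3 module as a stand-in target; the file, table, and column names are assumptions for illustration:

  # Hypothetical batch ETL step: CSV in, cleaned table out.
  import sqlite3
  import pandas as pd

  # Extract: read raw records from a source file.
  raw = pd.read_csv("source_orders.csv")

  # Transform: normalize column names, parse dates, drop invalid rows.
  raw.columns = [c.strip().lower() for c in raw.columns]
  raw["order_date"] = pd.to_datetime(raw["order_date"], errors="coerce")
  clean = raw.dropna(subset=["order_date"])

  # Load: write the cleaned batch into a target table, replacing prior runs.
  with sqlite3.connect("warehouse.db") as conn:
      clean.to_sql("orders", conn, if_exists="replace", index=False)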

Professional and People skills:

  • Ability to understand the business context and drivers of BI projects, or to ask questions about those drivers
  • Self-starter who can logically guide their own work (“what’s the next step?”), with some assistance and direction, of course
  • Exhibits thoroughness, completeness, and quality in work products, such as:
    • Frequently checking their own work for common-sense validity
    • Keeping work products, documentation, and code consistent and organized
  • Applies creative thinking to problem solving
  • Ability and willingness to troubleshoot when something isn’t working
  • Collaborative nature (though working alone is great and necessary!): readily consults others for input and puts forth ideas as part of a team

Developmental Goals:

  • We want people who will join our team for the long term and who are interested in their own professional development and trajectory with Parsons.
  • Readily gives and receives feedback
  • Willingness to learn new skills and train on new technologies or concepts

Technical Skills:

  • Data Scientist
    • Experience with statistical software (e.g., R, Python) for statistical analysis/modeling of operational data, and with data visualization libraries (e.g., matplotlib, ggplot2, seaborn).
    • Ability to understand and utilize various data sources (internal, external, structured, unstructured) for analytical insights.
    • Experience querying data with SQL (a brief example follows this section)
    • Experience with version-control systems such as Git
    • Experience with machine learning frameworks (e.g., scikit-learn, TensorFlow, tidymodels) and data analysis libraries (e.g., NumPy, Pandas, tidyverse) a PLUS
    • Ability to understand and implement statistical or ML algorithms in R/Python a PLUS
    • Ability to write efficient, production level code a PLUS
    • Knowledge of cloud computing services (AWS, Azure, GCP) a PLUS
  • Data Engineer
    • SQL Server
      • Intermediate-level T-SQL development skills
      • Build and troubleshoot complex stored procedures, functions, and views
      • Basic database management (security, backup/restore)
    • Practical experience with programming languages such as C#/.NET or Python for data manipulation
    • Understanding of and practical experience with ETL tools, either on-premises (SSIS/T-SQL) or cloud-based (Azure Data Factory, AWS Glue), for transferring data between various data sources
    • Understanding of data modeling techniques (dimensional modeling)
    • Experience with Microsoft Fabric features (e.g., data warehouses, lakehouses, data pipelines) a PLUS
    • Experience with Oracle Cloud and/or PL/SQL a PLUS
    • Experience developing with Informatica, AWS, and/or Azure data services a PLUS
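
For the SQL skills above, a small illustrative example of parameterized querying from Python, shown against SQLite for self-containment; the same pattern carries over to SQL Server through a driver such as pyodbc, and the schema here is hypothetical:

  # Hypothetical parameterized query; table and columns are illustrative.
  import sqlite3

  with sqlite3.connect("warehouse.db") as conn:
      rows = conn.execute(
          """
          SELECT customer_id, COUNT(*) AS order_count
          FROM orders
          WHERE order_date >= ?
          GROUP BY customer_id
          ORDER BY order_count DESC
          """,
          ("2024-01-01",),
      ).fetchall()

  # Print the five most active customers from the result set.
  for customer_id, order_count in rows[:5]:
      print(customer_id, order_count)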

Job Type:

  • Full-Time

Salary:

  • $22.00/Hour (Plus Benefits)

Location:

  • Dallas, TX

Benefits:

  • Free Individual Health Insurance
  • Free Training (Program specific)
  • Paid Vacation
  • Paid Company Holidays
  • Education Assistance/Reimbursement (Toward first degree - Bachelor's/Associate's)
  • Individual Mentor 
  • 401k Retirement Savings
  • Interest free loans (Case basis)
  • Benefits valued at up to $25,000.00 annually