Data Engineer

Location: Granadilla and San Carlos, Costa Rica

Job Overview

You will be part of a team turning data into actionable insights using algorithms and machine learning. You must have experience with a range of data wrangling and analysis methods and tools. You will validate customer data on entry and perform exploratory data analysis to propose the next steps in data cleansing and feature engineering. The decisions that you help make about variable selection will be translated into data pipelines that feed the modeling and visualization processes. This work requires strong problem-solving and analytical skills, as well as a basic understanding of statistics. You should be curious and enthusiastic about data, and have the communication skills needed to explain your findings. We are seeking a Data Scientist with a versatile skill set to help our data-driven customers succeed.

Responsibilities

  • Enhance the data handover and quality assessment procedures to improve the prioritization of subsequent steps.
  • Process, cleanse, and verify the integrity of data used for analysis.
  • Perform exploratory data analysis (EDA) and present the results clearly.
  • Maintain clear and coherent communication within the team and with the customer, both verbal and written, to understand data needs and report results.
  • Use machine learning tools and statistical techniques to select features and to build and optimize classifiers.
  • Stay curious and enthusiastic about using algorithms to solve problems and enthuse others. Mentor team members in their data science careers.
  • Take a genuine interest in data and solve problems logically and methodically.
  • Assess the effectiveness and accuracy of machine learning methods and data cleansing techniques.
  • Develop custom data models and algorithms to apply to data sets.
  • Use predictive modeling to improve and optimize customer experiences, revenue generation, ad targeting, and other business outcomes.
  • Develop processes and tools to monitor and analyze model performance and data accuracy.
  • Perform queries on the data to discern data anomalies, and perform quality audits.
  • Measure and communicate the accuracy of data products (machine learning models, algorithms).

Skills and Qualifications (Required)

  • Solid understanding of basic statistics (parametric and non-parametric).
  • Solid understanding of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.), their assumptions, and their real-world advantages/drawbacks.
  • Programming experience, for example in R or Python. Excellence in at least one programming language is highly desirable.
  • Strong problem-solving skills, oriented towards product delivery.
  • Experience creating and implementing data analysis pipelines.
  • Ability to handle imperfections in data through data cleaning and data wrangling.
  • Good English communication skills and a collaborative approach to sharing ideas and finding solutions.
  • Team player, flexible, and creative.
  • Critical attitude towards your work and deliveries.
  • Ability to come up with solutions independently, even for abstract problems.
  • Bachelor’s Degree in a scientific field that has provided experience with experimental design, data collection, data analysis, visualization, and reporting, plus 2-4 years of relevant work experience.

Preferred (Nice to have)

  • Knowledge of advanced statistical techniques and concepts (properties of distributions, statistical tests and their proper usage, regression, maximum likelihood estimators, etc.) and experience applying them, including an understanding of when different techniques are (or aren’t) a valid approach.
  • Experience with data visualization frameworks, such as D3.js and Vega. It is important to be familiar not just with the tools needed to visualize data, but also with the principles behind presenting data visually and communicating information.
  • Experience with machine learning frameworks such as TensorFlow, Keras, and torch.
  • Experience with visualization and analysis libraries such as scikit-learn, matplotlib, Bokeh, and pandas (Python) or dplyr, ggplot2, and SparkR (R).
  • Ability to describe findings, and how techniques work, to both technical and non-technical audiences.
  • Proficiency in using database query languages such as SQL.
  • Familiarity with Big Data or cloud environments (e.g. AWS, Google Cloud Platform, etc.).
  • Master’s Degree in a scientific field that has provided experience with experimental design, data collection, data analysis, visualization, and reporting.
  • At least 3 years of experience in mathematical modeling and model optimization.
  • At least 3 years of experience in predictive analytics using statistical modeling techniques to conduct forecasting.
  • At least 1 year of experience in advanced analytics techniques in a consulting environment and not solely in an academic environment.
  • Fluent in English.
