The Data Scientist is responsible for designing, developing, implementing, and testing the statistical and machine learning models that form the core components and integrated modules of Cognistx’s products and services. The role requires continuous improvement and maintenance of algorithms and code routines. The Data Scientist takes part in design, planning, and coding activities alongside other staff members such as Software Engineers, Product Managers, and Cloud Specialists. An additional key function of the role is interacting and communicating with clients to gather requirements and present results.
- Client data science requirements gathering to define analytic/predictive objectives of Cognistx projects and/or products, involving different industries and backgrounds
- Data gap analysis, followed by the development and execution of data analysis plans
- Execution of data analysis projects by a) documenting the client’s requirements (including statistical analysis and visualization of data, data sampling, cleaning, and preprocessing), and b) designing, creating, and optimizing statistical machine learning models to provide the desired analytic capability (clustering, classification, matching/retrieval, predictive analysis, regression, anomaly detection, etc.) on large-scale, real-world datasets
- Identify tradeoffs among data science techniques and contrast design alternatives, within the context of specific data science projects; apply and customize ensembles of trained models and coded modules for application-specific data science requirements and objectives
- Best-practices implementation, maintenance, and continuous improvement of created platforms and/or models, along with formal documentation and design/documentation reviews
- Writing, communicating, and presenting results and findings to both technical and non-technical audiences
- Research activities on special projects focused on increasing the company’s specialization and knowledge in a specific field related to Machine Learning and Artificial Intelligence.
- Some projects will require participation in defining the Statement of Work and project proposal for clients, specifically in defining the statistical models to implement based on the business objectives.
- Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties, or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.
Required Education and Experience
- Master's degree in Computer Science or a related field (Data Science, Computer Engineering, Software Engineering, Statistics, Mathematics, Information Systems Management, etc.)
- Good knowledge of SQL, NoSQL, and Python data analysis libraries such as pandas, NumPy, scikit-learn, Keras, PyTorch, TensorFlow, Matplotlib, Seaborn, Bokeh, and Plotly.
- Knowledge from academic coursework in areas such as Probability, Statistics, Machine Learning, Natural Language Processing, Deep Learning, Data Mining, Computer Vision, Big Data, Business Intelligence, and Information Retrieval.
- Solid knowledge of Statistics and Probability concepts such as A/B testing, ANOVA, and hypothesis testing.
- Solid knowledge of Machine Learning (classification and regression) algorithms such as Decision Trees, Support Vector Machines, Random Forest, Logistic Regression, AdaBoost, Neural Networks, and XGBoost. Able to work with imbalanced datasets and with both structured and unstructured data.
- Experience with and knowledge of time series modeling for predictive analysis.
- Experience with and solid knowledge of working with large datasets on cloud services (Amazon Web Services, Google Cloud Platform, or Azure).
Preferred Education & Experience
- Experience with other programming languages such as Java, R, and MATLAB.
- Experience with Microsoft Office, Tableau, Hadoop, MongoDB, Hive, and Spark.
- Knowledge of state-of-the-art NLP language models such as BERT, XLNet, and GPT is a plus.
- Educational or work experience in biomedical fields and/or drug discovery is a plus.