John Snow Labs, Turi and Alderley.ai Partnership Showcases Turnkey Productivity of Data Science on a Massive Healthcare Dataset

Now publicly available: A tangible example of saving substantial time & money by combining clean, current & enriched datasets with a highly optimized, scalable data science toolset.

Lewes, Delaware, July 13, 2016: It is widely reported that data scientists spend 50% to 80% of their time preparing data for analysis. John Snow Labs is a data operations company that addresses this challenge, focusing on the healthcare domain. It serves software and data science companies that want to outsource data operations to a trusted specialist. John Snow Labs makes them faster by providing high-quality, clean, up-to-date and ready-to-use datasets, so they can focus on what they do best.

Domain experts curate & enrich datasets for specific data science challenges, and deliver turnkey data that is high quality, always up to date, and interoperable across different sources. It is consistently documented, so users can easily tell what each field and data element means. And most fundamentally, it is just there: already formatted, optimized and loaded into the client’s analytics platform of choice.

A tangible example of saving time & money

The interactive, editable notebook detailing the step-by-step analysis is publicly available. Mark Pinches, a data scientist with years of expertise in pharma and the founder of Alderley.ai, built the IPython Notebook used to perform the analysis. It uses the US healthcare providers dataset from John Snow Labs to answer queries using slicing, joins, aggregations and visualizations. The dataset is tabular, with almost 5 million rows and over 300 columns. The analysis makes heavy use of GraphLab Create’s SFrame library, provided by Turi (formerly Dato), a John Snow Labs data science partner, taking advantage of its optimizations & scalability for large out-of-memory datasets.
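As a rough illustration of the three query operations named above (slicing, joins and aggregations), here is a minimal sketch in plain Python. It is not the published notebook and does not use the SFrame API; the tiny table, the lookup table and every column name in it are hypothetical stand-ins for the real providers dataset.

```python
from collections import defaultdict

# Hypothetical miniature "providers" table: one dict per row.
# The real dataset has millions of rows and hundreds of columns.
providers = [
    {"provider_id": 1, "state": "DE", "specialty_code": "A"},
    {"provider_id": 2, "state": "DE", "specialty_code": "B"},
    {"provider_id": 3, "state": "NY", "specialty_code": "A"},
    {"provider_id": 4, "state": "NY", "specialty_code": "A"},
]

# Hypothetical lookup table to join against.
specialties = {"A": "Cardiology", "B": "Oncology"}


def slice_rows(rows, column, value):
    """Slicing: keep only the rows where `column` equals `value`."""
    return [row for row in rows if row[column] == value]


def join_specialty(rows, lookup):
    """Join: attach the specialty name looked up by specialty_code."""
    return [dict(row, specialty=lookup[row["specialty_code"]]) for row in rows]


def count_by(rows, column):
    """Aggregation: count the rows for each distinct value of `column`."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[column]] += 1
    return dict(counts)


ny_rows = slice_rows(providers, "state", "NY")
joined = join_specialty(providers, specialties)
per_specialty = count_by(joined, "specialty")
print(len(ny_rows), per_specialty)
```

In-memory lists like these would not scale to a table of the size described; that gap is what a disk-backed, columnar structure such as SFrame is meant to close.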

Mark Pinches works in the UK and Europe, combining onsite work with remote access. His chosen tools are QlikView for data wrangling and rapid application delivery, and Python for scalable machine learning with GraphLab and scikit-learn, with seaborn and matplotlib for visualization. Mark has extensive experience in data modelling, analysis and visualization, applied statistics and machine learning, with significant study in drug development and toxicology. He has used everything from deep learning techniques to historical research to solve problems.

GraphLab Create is an extensible machine learning framework that enables developers and data scientists to easily build and deploy intelligent applications and services at scale. It includes distributed data structures and rich libraries for data transformation and manipulation; scalable, task-oriented machine learning toolkits for creating, evaluating, and improving machine learning models; and data and model visualization for all aspects of development. GraphLab Create™ is built on top of state-of-the-art technology in scalable data structures, powerful machine learning methods, intuitive visualization, and flexible deployment options. It is written in C++ for the best possible performance, with a Python interface for easy accessibility. The API, including auto-tuning for complex machine learning models, is designed to be easy to use for beginners, yet flexible enough for expert data scientists.

John Snow Labs is a data operations company that accelerates data science, analytics and software teams by providing turnkey data for analysis. The company’s team of medical and data specialists provides data across 15 broad categories, and removes the ongoing burden many analytics projects face in finding, cleaning, enriching, updating and publishing reference data.

“High productivity data science rests on three pillars: Having clean, current & rich data so that one doesn’t spend 80% of their time preparing data for analysis; having fast, scalable & extensive machine learning libraries and tools; and having a domain expert at the helm to put them together. John Snow Labs is proud to partner with Turi and Alderley.ai to provide best-of-class solutions for each of these three pillars,” said the founding team.

For further information, visit: www.JohnSnowLabs.com

Please follow John Snow Labs:

Twitter: twitter.com/johnsnowlabs
LinkedIn: www.linkedin.com/company/johnsnowlabs
Facebook: www.facebook.com/JohnSnowLabsInc
Google+: plus.google.com/u/0/+Johnsnowlabs/posts

Media Contact:

John Snow Labs
Attn: Ida Lucente
16192 Coastal Highway
Lewes, DE 19958
+1 (302) 786-5227
ida@JohnSnowLabs.com
