SAN FRANCISCO, CA - (Marketwired) - 06/19/13 - Concurrent, Inc., the enterprise Big Data application platform company, today announced that Paco Nathan, director of Data Science, will deliver a talk titled “Pattern - an open source project for migrating predictive models from SAS, R, Microstrategy®, etc., onto Hadoop” at the 6th Annual Hadoop Summit North America, taking place June 26-27, 2013 in San Jose, Calif. This two-day event will feature Apache Hadoop thought leaders who will showcase successful Hadoop use cases, share development tips and tricks, and educate organizations about how best to leverage Apache Hadoop as a key component of their enterprise data architecture.
“Pattern - an open source project for migrating predictive models from SAS, R, Microstrategy®, etc., onto Hadoop” speaking session at Hadoop Summit
Paco Nathan, director of Data Science at Concurrent, Inc., the company behind the Cascading application framework
Wednesday, June 26 at 4:55 p.m. PDT
San Jose Convention Center
Register at
Pattern is a free, open source project that takes models trained in popular analytics tools, such as SAS®, Microstrategy®, R and SQL Server, and runs them at scale on Apache Hadoop. This machine-learning library, based on the popular Cascading framework, works by translating PMML into data workflows and can be quickly deployed on your Apache Hadoop data. PMML models can be run in a pre-defined JAR file with no coding required, and can also be combined with other flows based on ANSI SQL (Lingual), Scala (Scalding) and Clojure (Cascalog) to meet enterprise requirements. Benefits include greatly reduced development costs and fewer licensing issues at scale, while leveraging a combination of Apache Hadoop clusters, existing intellectual property in predictive models, and the core competencies of analytics staff. Sample code in this talk will show apps using predictive models built in SAS and R. In addition, examples will show how to compare variations of models for large-scale customer experiments. Portions of this material come from the O'Reilly book “Enterprise Data Workflows with Cascading,” to be published on July 10, 2013.
Paco Nathan is director of Data Science at Concurrent, Inc., where he leads the company's developer outreach program. He has a dual background from Stanford in math/statistics and distributed computing, and has more than 25 years of experience in the technology industry. Nathan is an expert in Hadoop, R, predictive analytics, machine learning and natural language processing.
Cascading website:
Pattern website:
Company:
Contact Us:
Follow us on Twitter:
Concurrent, Inc.'s vision is to become the #1 software platform choice for Big Data applications. Concurrent builds application infrastructure products that are designed to help enterprises create, deploy, run and manage data processing applications at scale on Apache Hadoop.
Concurrent is the mind behind Cascading, the most widely used and deployed technology for Big Data applications, with more than 75,000 user downloads a month. Used by thousands of data-driven businesses including Twitter, eBay, The Climate Corp, and Etsy, Cascading is the de facto standard in open source application infrastructure technology. Concurrent is headquartered in San Francisco. Visit Concurrent online at .
Danielle Salvato-Earl
Kulesa Faul for Concurrent, Inc.
(650) 340 1982