System that replaces human intuition with algorithms outperforms 615 of 906 human teams
Big-data analysis consists of searching for buried patterns that have some kind of predictive power. But choosing which “features” of the data to analyze usually requires some human intuition. In a database containing, say, the beginning and end dates of various sales promotions and weekly profits, the crucial data may not be the dates themselves but the spans between them, or not the total profits but the averages across those spans.
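To make the promotions example concrete, here is a minimal sketch in Python of what deriving such features by hand could look like. The records, field names, and the `promotion_features` helper are all hypothetical, invented for illustration; they are not taken from the researchers' system.

```python
from datetime import date

# Hypothetical raw tables: promotion start/end dates and weekly profits.
promotions = [
    {"name": "spring_sale", "start": date(2015, 3, 2), "end": date(2015, 3, 22)},
]
weekly_profits = [  # (first day of the week, profit for that week)
    (date(2015, 3, 2), 1000.0),
    (date(2015, 3, 9), 1400.0),
    (date(2015, 3, 16), 1200.0),
    (date(2015, 3, 23), 900.0),  # falls after the promotion ends
]


def promotion_features(promo, profits):
    """Derive span-based features instead of using the raw dates and totals."""
    # Length of the promotion in days, counting both endpoints.
    span_days = (promo["end"] - promo["start"]).days + 1
    # Profits for the weeks that begin inside the promotion span.
    in_span = [p for week, p in profits if promo["start"] <= week <= promo["end"]]
    avg_weekly_profit = sum(in_span) / len(in_span) if in_span else None
    return {"span_days": span_days, "avg_weekly_profit": avg_weekly_profit}


features = promotion_features(promotions[0], weekly_profits)
print(features)
```

Neither derived value appears in the raw tables. Someone has to decide that the span and the per-span average are worth computing, and that decision is the step the article describes as requiring intuition.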
MIT researchers aim to take the human element out of big-data analysis, with a new system that not only searches for patterns but designs the feature set, too. To test the first prototype of their system, they enrolled it in three data science competitions, in which it competed against human teams to find predictive patterns in unfamiliar data sets. Of the 906 teams participating in the three competitions, the researchers’ “Data Science Machine” finished ahead of 615.
In two of the three competitions, the predictions made by the Data Science Machine were 94 percent and 96 percent as accurate as the winning submissions. In the third, the figure was a more modest 87 percent. But where the teams of humans typically labored over their prediction algorithms for months, the Data Science Machine took somewhere between two and 12 hours to produce each of its entries.
“We view the Data Science Machine as a natural complement to human intelligence,” says Max Kanter, whose MIT master’s thesis in computer science is the basis of the Data Science Machine. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”
Between the lines
Kanter and his thesis advisor, Kalyan Veeramachaneni, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), describe the Data Science Machine in a paper that Kanter will present next week at the IEEE International Conference on Data Science and Advanced Analytics.
Veeramachaneni co-leads the Anyscale Learning for All group at CSAIL, which applies machine-learning techniques to practical problems in big-data analysis, such as determining the power-generation capacity of wind-farm sites or predicting which students are at risk for dropping out of online courses.
“What we observed from our experience solving a number of data science problems for industry is that one of the very critical steps is called feature engineering,” Veeramachaneni says. “The first thing you have to do is identify what variables to extract from the database or compose, and for that, you have to come up with a lot of ideas.”
In predicting dropout, for instance, two crucial indicators proved to be how long before a deadline a student begins working on a problem set and how much time the student spends on the course website relative to his or her classmates. MIT’s online-learning platform MITx doesn’t record either of those statistics, but it does collect data from which they can be inferred.
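As a rough illustration of how such indicators might be inferred from data that is collected, the Python sketch below computes both from a made-up activity log. The log layout, student names, and function names are assumptions for the example only; the article does not describe how MITx actually stores its data.

```python
from datetime import datetime
from statistics import mean

# Hypothetical deadline and activity logs (not the real MITx schema).
deadline = datetime(2015, 10, 16, 21, 0)
problem_set_events = [  # (student, time of an action on the problem set)
    ("alice", datetime(2015, 10, 14, 9, 0)),
    ("alice", datetime(2015, 10, 15, 20, 0)),
    ("bob", datetime(2015, 10, 16, 19, 0)),
]
site_sessions = [  # (student, minutes spent on the course website)
    ("alice", 90),
    ("alice", 30),
    ("bob", 40),
]


def hours_before_deadline(events, due):
    """Hours between each student's first recorded action and the deadline."""
    first_seen = {}
    for student, when in events:
        if student not in first_seen or when < first_seen[student]:
            first_seen[student] = when
    return {s: (due - t).total_seconds() / 3600 for s, t in first_seen.items()}


def relative_time_on_site(sessions):
    """Each student's total minutes divided by the class average."""
    totals = {}
    for student, minutes in sessions:
        totals[student] = totals.get(student, 0) + minutes
    class_average = mean(totals.values())
    return {s: total / class_average for s, total in totals.items()}


lead_hours = hours_before_deadline(problem_set_events, deadline)
relative_time = relative_time_on_site(site_sessions)
print(lead_hours, relative_time)
```

In this toy data, the student who starts two hours before the deadline also spends half the class-average time on the site. Those are the kinds of derived signals the paragraph above describes.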
Read more: Automating big-data analysis