Researchers at Columbia University, Princeton University and Harvard University have developed a new approach for analyzing big data that can drastically improve the ability to make accurate predictions in medicine, complex diseases, social science phenomena, and other areas.
In a study published in the December 13 issue of Proceedings of the National Academy of Sciences (PNAS), the authors introduce the Influence score, or “I-score,” a statistic correlated with how much a set of variables can inherently predict, which they call its “predictivity.” The I-score can consequently be used to identify highly predictive variables.
“In our last paper, we showed that significant variables may not necessarily be predictive, and that good predictors may not appear statistically significant,” said principal investigator Shaw-Hwa Lo, a professor of statistics at Columbia University. “This left us with an important question: how can we find highly predictive variables then, if not through a guideline of statistical significance? In this article, we provide a theoretical framework from which to design good measures of prediction in general. Importantly, we introduce a variable set’s predictivity as a new parameter of interest to estimate, and provide the I-score as a candidate statistic to estimate variable set predictivity.”
Current approaches to prediction generally include applying a significance-based criterion to choose which variables go into a model, and evaluating variables and models together using cross-validation or independent test data.
“Using the I-score prediction framework allows us to define a novel measure of predictivity based on observed data, which in turn enables assessing variable sets for, preferably high, predictivity,” Lo said, adding that, although the idea is intuitively obvious, too little attention has been paid to treating predictivity as a parameter of interest to estimate. Motivated by the needs of current genome-wide association studies (GWAS), the study authors provide such a discussion.
In the paper, the authors define the predictivity of a variable set and show that a direct, simple sample estimate of predictivity does not give usable information to the prediction-oriented researcher. They go on to demonstrate that the I-score can be used to compute a measure that asymptotically approaches predictivity. The I-score can effectively differentiate between noisy and predictive variables, Lo explained, making it helpful in variable selection. A further benefit is that, while usual approaches require heavy use of cross-validation or testing data to evaluate the predictors, the I-score approach relies on such data much less.
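The article does not state the formula for the I-score, so the following is only a rough sketch of how a partition-based statistic of this kind could separate a predictive variable from a noisy one. It assumes the form I = Σ_j n_j² (Ȳ_j − Ȳ)² / (n·s²), where the discrete variable set splits the sample into cells j with n_j observations and cell mean Ȳ_j; this form, the function name `i_score`, and the toy data are all assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict


def i_score(rows, y):
    """Illustrative partition-based score (assumed form, not from the article).

    rows: list of tuples of discrete variable values, one tuple per observation
    y:    list of numeric outcomes, same length as rows
    """
    n = len(y)
    y_bar = sum(y) / n
    var = sum((v - y_bar) ** 2 for v in y) / n  # overall outcome variance
    if var == 0:
        return 0.0  # a constant outcome leaves nothing to predict

    # Group outcomes by the cell defined by the variable set's joint values.
    cells = defaultdict(list)
    for key, v in zip(rows, y):
        cells[key].append(v)

    # Sum squared cell size times squared deviation of the cell mean.
    total = 0.0
    for vals in cells.values():
        n_j = len(vals)
        total += n_j ** 2 * (sum(vals) / n_j - y_bar) ** 2
    return total / (n * var)


# Toy data: 100 observations, outcome is 0 for the first half and 1 after.
y = [0 if i < 50 else 1 for i in range(100)]
informative = [(v,) for v in y]             # variable that determines y
noise = [(i % 2,) for i in range(100)]      # variable unrelated to y

print(i_score(informative, y))  # 50.0
print(i_score(noise, y))        # 0.0
```

In this toy case the informative variable scores high because its cells have very different outcome means, while the noise variable scores zero because every cell mean equals the overall mean.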
“We offer simulations and an application of the I-score on real data to demonstrate the statistic’s predictive performance on sample data,” he said. “These show that the I-score can capture highly predictive variable sets, estimates a lower bound for the theoretical correct prediction rate, and correlates well with the out of sample correct rate. We suggest that using the I-score method can aid in finding variable sets with promising prediction rates, however, further research in the avenue of sample-based measures of predictivity is needed.”
The authors conclude that the I-score would be useful in many applications: for example, in formulating predictions about diseases from high-dimensional data such as gene datasets; in the social sciences, for text prediction; and in predicting terrorism, civil war, elections and financial markets.
“We’re hoping to impress upon the scientific community the notion that for those of us who might be interested in predicting an outcome of interest, possibly with rather complex or high dimensional data, we might gain by reconsidering the question as one of how to search for highly predictive variables (or variable sets) and using statistics that measure predictivity to help us identify those variables to then predict well,” Lo said. “For statisticians in particular, we’re hoping this opens up a new field of work that would focus on designing new statistics that measure predictivity.”