Hdl Handle:
http://hdl.handle.net/10755/622112
Category:
Full-text
Format:
Text-based Document
Type:
Presentation
Level of Evidence:
N/A
Research Approach:
N/A
Title:
Analytical Challenges in the Era of Big Data
Other Titles:
Use of Big Data to Influence Nursing Care
Author(s):
Jeffery, Alvin D.
Lead Author STTI Affiliation:
Iota
Author Details:
Alvin D. Jeffery, PhD, RN-BC, CCRN-K, FNP-BC. Professional Experience: 2013-present -- PhD Student, Vanderbilt University, Nashville, TN; 2014-present -- Quality Scholar Nurse Fellow, U.S. Department of Veterans Affairs, Nashville, TN; 2016-present -- Nursing Performance Measurement Consultant, Hospital Corporation of America, Nashville, TN. Has spoken (nationally and internationally) and published on a variety of nursing professional development and leadership topics. Author Summary: Alvin Jeffery is a post-doc fellow in Medical Informatics with the U.S. Department of Veterans Affairs in Nashville, TN. He focuses on research and quality improvement activities surrounding nurses' delivery of care. Alvin is a recent PhD graduate from Vanderbilt University’s School of Nursing, and he is building a program of research focused on the design, development, and evaluation of probability-based clinical decision support tools.
Abstract:

Purpose: The popularity of “big data,” along with an increasing capacity for real-time predictive analytics, holds significant promise for nurses and other clinicians to gain new insights and develop novel decision support tools from our large clinical datasets. Unfortunately, these large datasets are not the panacea that some big data proponents would tout. For nurses with vast subject matter expertise in a clinical area who desire to leverage big data for solving practical problems, roadblocks quickly surface in the form of acquisition and management of data, missing data, meeting assumptions of statistical models, and model evaluation for statistical and clinical performance. This talk will engage the audience in addressing these issues using an exemplar of the development of a prediction model for in-hospital cardiopulmonary arrest.

Methods: The following four topics will be addressed:

Data Acquisition and Management: From ethics approval to ensuring individual patient privacy to preventing undesired user access, collecting and storing “big data” is no simple task. The presenter will provide: (a) an overview of key concepts, (b) an exemplar for constructing a data acquisition and management team, and (c) several resources for learning more independently.

Missing Data: Almost all large datasets contain some amount of missing data. Regardless of the amount, finding the cause of missingness is of paramount importance. Approaches to determining a cause will be introduced, and disadvantages of complete case analysis will be described. Advantages and disadvantages of median imputation, multiple imputation, and machine learning imputation will be compared.
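
The trade-off between complete case analysis and median imputation described above can be illustrated with a minimal sketch. The heart-rate values and function names below are hypothetical and are not taken from the presentation; multiple imputation and machine learning imputation are not shown.

```python
from statistics import median, pvariance

def complete_cases(values):
    """Complete case analysis: drop every missing (None) observation."""
    return [v for v in values if v is not None]

def median_impute(values):
    """Median imputation: fill each missing value with the observed median."""
    fill = median(complete_cases(values))
    return [fill if v is None else v for v in values]

# Hypothetical heart rates with three missing readings (illustrative only).
heart_rates = [72, None, 88, 95, None, 110, 64, None]

cc = complete_cases(heart_rates)       # smaller sample: 5 of 8 records remain
imputed = median_impute(heart_rates)   # full sample: all 8 records remain

# Median imputation keeps every record but shrinks the spread, because
# all filled-in values sit at the same central point.
print(len(cc), pvariance(cc))
print(len(imputed), pvariance(imputed))
```

In this toy case, complete case analysis discards three of eight records, while median imputation keeps them at the cost of an artificially small variance. Neither approach addresses why the values are missing.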

Statistical Model Assumptions: There are a variety of statistical models available, and with recent advances in machine learning methods, more approaches to retrieve information from the data are available to a wide array of users. An overview of the purpose and requirements of traditional modeling (e.g., logistic and linear regression) and machine learning approaches (e.g., random forests and cluster analyses) will be provided.

Model Evaluation: Determining how well a model performs on the current data and how well it is expected to perform on future data is essential in determining whether or not the model is helpful for clinical care. Internal (e.g., bootstrapping and cross-validation) versus external validation (e.g., split sample and chronological validation) techniques will be presented along with their respective advantages and disadvantages.
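
Two of the validation schemes named above, cross-validation and chronological validation, differ mainly in how records are partitioned. The sketch below shows only the partitioning step, under assumed record fields (`year`) and hypothetical function names; it does not fit or score any model.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds.

    In practice the records would usually be shuffled first; that step is
    omitted here to keep the output deterministic.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def chronological_split(records, cutoff_year):
    """Develop on earlier admissions, validate on later ones."""
    develop = [r for r in records if r["year"] < cutoff_year]
    validate = [r for r in records if r["year"] >= cutoff_year]
    return develop, validate

# Hypothetical admissions, one record per year from 2010 through 2016.
records = [{"id": i, "year": 2010 + i} for i in range(7)]

folds = list(k_fold_indices(len(records), 3))
develop, validate = chronological_split(records, cutoff_year=2015)
```

Every record serves as test data exactly once across the folds, whereas the chronological split holds out only the most recent records, which more closely mimics how a model would be applied to future patients.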

Results: Our in-hospital cardiopulmonary arrest prediction model required a team-based approach to solving the aforementioned challenges, and the audience will hear not only how we chose to solve the problems but also other approaches we considered. From the perspective of data acquisition/management, we found the best approach to be the inclusion of database and informatics specialists who used structured query language to extract the relevant data and then store it on a secure, organizational server. Following a simulation study, we discovered the missing data problem was best resolved by creating a multiple imputation model that included the outcome variable. Statistical model assumptions were best met by not assuming linearity while not permitting too many spline knots. Model evaluation comprised internal bootstrap validation for the regression models and split-sample validation for the machine learning methods.
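
The internal bootstrap validation mentioned above is commonly implemented as an optimism correction. The sketch below is one such approach under loud assumptions: the "model" is a stand-in (the event rate within each level of a single binary predictor) rather than the regression model from the presentation, the data are invented, and the Brier score is used as an assumed performance measure.

```python
import random

def fit(rows):
    """Stand-in model: observed event rate for each level of a binary predictor."""
    rates = {}
    for level in (0, 1):
        outcomes = [y for x, y in rows if x == level]
        # Fall back to 0.5 if a resample happens to contain no rows for a level.
        rates[level] = sum(outcomes) / len(outcomes) if outcomes else 0.5
    return rates

def brier(model, rows):
    """Mean squared difference between predicted risk and outcome (lower is better)."""
    return sum((model[x] - y) ** 2 for x, y in rows) / len(rows)

def optimism_corrected_brier(rows, n_boot=200, seed=0):
    """Return (apparent, optimism-corrected) Brier scores."""
    rng = random.Random(seed)
    apparent = brier(fit(rows), rows)
    optimism = 0.0
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]  # resample with replacement
        model = fit(sample)
        # How much worse the resample-trained model does on the original data
        # than on the resample it was trained on.
        optimism += brier(model, rows) - brier(model, sample)
    return apparent, apparent + optimism / n_boot

# Hypothetical (predictor, outcome) pairs; not data from the study.
rows = [(0, 0)] * 6 + [(0, 1)] * 2 + [(1, 0)] * 3 + [(1, 1)] * 5

apparent, corrected = optimism_corrected_brier(rows)
print(apparent, corrected)
```

The apparent score is computed on the same data used to fit the model, so it tends to look better than future performance will; the averaged optimism term is an estimate of that gap.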

Conclusion: Arriving at clinically meaningful insights contained within large datasets requires multifaceted expertise and teamwork. Nurses and other clinicians are the best members of the team to identify a problem that “big data” can help solve. To ensure a clinically meaningful solution surfaces from big data efforts, nurses should be aware of common challenges in big data research. As nurses become more knowledgeable, they position themselves to be leaders in these research teams and advocates for implementation of novel findings.

Keywords:
big data; clinical decision support; quantitative methods
Repository Posting Date:
25-Jul-2017
Date of Publication:
25-Jul-2017
Other Identifiers:
INRC17H15
Conference Date:
2017
Conference Name:
28th International Nursing Research Congress
Conference Host:
Sigma Theta Tau International
Conference Location:
Dublin, Ireland
Description:
Event Theme: Influencing Global Health Through the Advancement of Nursing Scholarship

Full metadata record

DC Field | Value | Language
dc.language.iso | en_US | en
dc.type.category | Full-text | en
dc.format | Text-based Document | en
dc.type | Presentation | en
dc.evidence.level | N/A | en
dc.research.approach | N/A | en
dc.title | Analytical Challenges in the Era of Big Data | en_US
dc.title.alternative | Use of Big Data to Influence Nursing Care | en
dc.contributor.author | Jeffery, Alvin D. | en
dc.contributor.department | Iota | en
dc.identifier.uri | http://hdl.handle.net/10755/622112 | -
dc.subject | big data | en
dc.subject | clinical decision support | en
dc.subject | quantitative methods | en
dc.date.available | 2017-07-25T15:18:25Z | -
dc.date.issued | 2017-07-25 | -
dc.date.accessioned | 2017-07-25T15:18:25Z | -
dc.conference.date | 2017 | en
dc.conference.name | 28th International Nursing Research Congress | en
dc.conference.host | Sigma Theta Tau International | en
dc.conference.location | Dublin, Ireland | en
dc.description | Event Theme: Influencing Global Health Through the Advancement of Nursing Scholarship | en
All Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.