What is Big Data?
“Big Data” describes large and complex datasets that are difficult to process using traditional data processing techniques.1 Advances in computing power have made analysis of such data possible, which has led to the growing popularity of Big Data in recent years. Big Data may include structured, semi-structured, and unstructured data from multiple sources, such as social media posts, web logs, and sensors, which analysts can examine for insights into user behavior patterns or trends.1 The sizable volume of data produced results from advances in information systems and technologies such as cloud computing and the Internet of Things (IoT).
Characteristics of Big Data
De Mauro et al., after a review of literature and an analysis of previous definitions, offer the following consensual definition of Big Data2:
“Information assets characterized by such a High Volume, Velocity, and Variety
to require specific Technology and Analytical Methods for its transformation into Value”
The five key characteristics of Big Data are volume, velocity, variety, value, and veracity.3 Volume refers to the vast amounts of data produced. Velocity refers to the rate at which data is generated, often in real time. Variety refers to the different types of data generated: structured, semi-structured, and unstructured. Value is the usefulness of data to organizations once it is processed and converted into actionable insights or information. Finally, veracity refers to data quality and reliability, which makes it necessary to have ways of detecting and correcting false, incorrect, or incomplete data.
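As a rough illustration of the veracity point above, the following Python sketch separates records that look valid from records that are missing a value or fall outside a plausible range. The sensor readings, field names, and temperature limits are all hypothetical and chosen only for illustration; real data quality pipelines typically apply many more checks.

```python
from typing import Optional

# Hypothetical sensor readings; None marks a missing value.
readings = [
    {"sensor_id": "s1", "temperature_c": 21.5},
    {"sensor_id": "s2", "temperature_c": None},
    {"sensor_id": "s3", "temperature_c": 480.0},
    {"sensor_id": "s4", "temperature_c": 19.8},
]

# Assumed plausible range, for illustration only.
MIN_C, MAX_C = -50.0, 60.0


def check_reading(reading: dict) -> Optional[str]:
    """Return a description of the problem, or None if the reading looks valid."""
    value = reading.get("temperature_c")
    if value is None:
        return "missing value"
    if not (MIN_C <= value <= MAX_C):
        return "out of plausible range"
    return None


def split_by_veracity(records):
    """Separate records that pass the checks from those that need review."""
    valid, flagged = [], []
    for record in records:
        problem = check_reading(record)
        if problem is None:
            valid.append(record)
        else:
            flagged.append((record["sensor_id"], problem))
    return valid, flagged


valid, flagged = split_by_veracity(readings)
```

Flagged records are kept alongside the reason they failed rather than silently dropped, so that someone can decide whether to correct or discard them.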
What are the current applications of Big Data?
Big Data has seen applications in various industries and domains such as health care, media and entertainment, IoT, manufacturing, and government.4 Organizations can employ Big Data to improve their operations by understanding user behavior and trends. Data miners apply advanced computing processes, such as machine learning, to Big Data to find patterns for making inferences or predictions from unstructured or large data sets. Predictive analytics combines historical data with statistical modeling, data mining techniques, and machine learning to predict future events or trends. The amount of data available on health behavior is growing exponentially, partly due to the use of mobile health applications and wearable technology, as well as data from social media. Highlighted below are some broad examples of applications of Big Data in the humanitarian and development field.
Disease Outbreak Surveillance
The rapid development of analytic capacity for Big Data in recent years has helped equip emergency response teams with real-time tools to monitor, contain, and potentially stop the spread of diseases. One such technique is geo-mapping healthcare data to help anticipate a disease outbreak.4
Personalized Medicine
In the healthcare industry, Big Data systems play a significant role in the advancement of personalized medicine and prescriptive analytics (the process of using data to determine an optimal course of action).4 Researchers study and analyze genetic data to establish the optimal course of treatment for each individual, minimize medication side effects, and predict future health risks.4
Early Warning of Disasters and Response
In the wake of global and national disasters, Big Data has proven useful in the various phases of the disaster management cycle: mitigation, preparedness, response, and recovery. Analysis of Big Data also creates varied possibilities for visualizing, analyzing, and predicting natural disasters. It has the potential to shape disaster management strategies that reduce human suffering and mitigate the economic impact on nations. For example, Descartes Labs builds data analysis algorithms that, as a side project, use large datasets of NASA and European Space Agency satellite imagery to predict floods and hurricanes. These deep learning algorithms process images of the whole planet, taken by satellites every five minutes, so they can detect the eye of a hurricane and track its flooding path.5
SBC and Big Data
As social and behavior change (SBC) practitioners know, human behavior is one of the main influences on health and wellness as well as on mortality and morbidity.6 Big Data and analytics can help SBC practitioners understand the behaviors of their target audience and the factors that influence those behaviors, and can provide insights on audience segments. All of this can guide the design of interventions and message development. Organizations, including United Nations Global Pulse, provide guidelines for evaluators, evaluation and program managers, policymakers, and funding agencies on how to take advantage of the rapidly emerging field of Big Data in the design and implementation of systems for monitoring and evaluating development programs.7 In another area of application, analysts can use data from digital products to make computer-based predictions about future trends and behaviors. Algorithms applied to the data can then uncover preferences, behaviors, and values, and provide insight into decision making.
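To make the idea of computer-based prediction of trends concrete, here is a minimal Python sketch that fits a straight line to past observations by ordinary least squares and extrapolates one step ahead. The weekly counts are invented for illustration, and a linear trend is an assumption of this sketch; practical predictive analytics usually relies on richer models and far more data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: weekly visits recorded by a digital health product.
weeks = [1, 2, 3, 4, 5]
weekly_visits = [100, 110, 120, 130, 140]

slope, intercept = fit_line(weeks, weekly_visits)

# Extrapolate the fitted trend to the next, unobserved week.
forecast_week_6 = slope * 6 + intercept
```

With these invented numbers the fitted trend rises by 10 visits per week, so the projection for week 6 is 150 visits.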
Advances in digital technology and data analytics have created unmatched potential to assess and change health behavior, accelerating the ability of science to understand and contribute to better health behavior and health outcomes.6 For example, wearable devices can capture data on the complexity and nuances of human behavior. Analysts can use such data to identify the convergence of variables that affect behavior at any given time and the evolution of an individual's behavior over time. This data may aid translational science by allowing for the creation of individualized and timely models of intervention delivery, and discovery science by exposing digital markers of health and risk behavior.6 Such data could also provide insight into the clinical trajectories of diagnosable illnesses over time and the diagnostic classification of clinically problematic behavior.
Big Data and SBC in Risk Communication and Surveillance
Big Data can play a crucial role in identifying rumors and misconceptions that spread through communities. Analysts can examine Big Data from social media, radio broadcasts, and other sources of information to flag rumors and misconceptions. The findings of such analysis can inform the design and development of risk communication messages that counteract mis- or disinformation. The capacity to mine the digital traces of social media conversations offers an opportunity to use tools such as machine learning algorithms to track the flow of risk communication messages in the larger public communication sphere.
One timely example is the role Big Data has played in the fight against the COVID-19 pandemic. Public health personnel have used Big Data to track the spread of the virus globally, contributing to an understanding of its nature and creating an ability to forecast its impact in different areas and populations of the world. Analysis of large social media datasets has helped identify rumors and misinformation, leading to strategy development for managing the infodemic.
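The flagging of rumors in social media posts described in this section can be sketched in Python as follows. This is a deliberately simple stand-in: it matches posts against a hypothetical watch list of terms rather than using the machine learning algorithms mentioned above, and the posts and terms are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical watch list of terms associated with known rumors.
RUMOR_TERMS = {"cure", "hoax", "microchip"}

# Invented example posts.
posts = [
    "Garlic is a miracle cure for the virus!",
    "Vaccination clinic opens Monday at the health center.",
    "They say the outbreak is a hoax.",
    "Heard the vaccine has a microchip. Total hoax?",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def flag_posts(posts, terms):
    """Return posts containing watch-list terms, plus a count of posts per term."""
    flagged = []
    counts = Counter()
    for post in posts:
        hits = sorted(set(tokenize(post)) & terms)
        if hits:
            flagged.append((post, hits))
            counts.update(hits)
    return flagged, counts


flagged, term_counts = flag_posts(posts, RUMOR_TERMS)
```

Counting how many posts mention each term gives a crude picture of which rumors are circulating most, which could help prioritize the messages a risk communication team develops. Keyword matching will miss paraphrased rumors and flag harmless posts, so flagged items would still need human review.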
Ethical Considerations
Alongside its potential benefits for social sciences and research, Big Data raises numerous ethical challenges and risks that interventions using it need to consider. Some of these concerns relate to respecting the autonomy of patients in health care through informed consent and protecting data privacy for study participants. Although regulations that govern research involving human subjects exist, treating Big Data the same way as traditional research does not adequately address the concerns that Big Data poses.
This trending topic brings together a curated collection of resources to aid understanding of Big Data and its role in SBC. If you have related materials you would like to share with us, please upload the items, or contact us at info@thecompassforsbc.org.
References
1. Oracle. (2023). What is Big Data? https://www.oracle.com/big-data/what-is-big-data/
2. De Mauro, A., Greco, M., & Grimaldi, M. (2014). What is big data? A consensual definition and a review of key research topics. 4th International Conference on Integrated Information. Madrid.
3. Nguyen, T. L. (2018). A framework for five big V’s of big data and organizational culture in firms. IEEE International Conference on Big Data (pp. 5411–5413). Institute of Electrical and Electronics Engineers.
4. Sunagar, P., Hanumantharaju, R., G. M., S., Kanavalli, A., & Srinivasa, K. G. (2020). Influence of big data in smart tourism. In S. Bhattacharyya, V. Snášel, D. Gupta, & A. Khanna, Hybrid Computational Intelligence Challenges and Applications (pp. 25–47). Academic Press. https://doi.org/10.1016/C2018-0-03259-4
5. Cee, S. (2019, March). Using big data to predict natural disasters. DeepTechWire. http://deeptechwire.com/using-big-data-to-predict-natural-disasters
6. Marsch, L. A. (2020). Digital health data-driven approaches to understand human behavior. Neuropsychopharmacology Reviews, pp. 191–196.
7. Bamberger, M. (2016). Integrating Big Data into the Monitoring and Evaluation of Development Programmes. UN Global Pulse.