FACEBOOK DATA ANALYSIS USING HADOOP AND SENTIMENT ANALYSIS ON COMMENTS
Keywords:
Big data technology, Hadoop, MapReduce, Facebook
Abstract
Data on the World Wide Web has grown so large and complex that it exceeds the capacity of traditional software and hardware to store, process, and distribute it. This growth marks a significant stage in the evolution of information and communication systems and has prompted the development of alternative techniques for monitoring and managing data flows. Website data, sensor data, and social network data can all be analyzed with big data technology. Such analysis links otherwise independent data sets to reveal patterns, for example predicting commercial trends for businesses or helping to prevent crime in the security sector. These forecasts give decision-makers new resources for understanding the situation at hand and, ultimately, for making choices that help them achieve their objectives.
In its most basic form, sentiment analysis (SA) consists of identifying whether a given piece of text is positive, negative, or neutral. The system uses a combination of natural language processing (NLP) and deep learning to locate and extract opinion-bearing commentary from the text. Sentiment analysis now has a wide variety of real-world uses. We discuss the most important related work, which has brought many improvements to the field of SA, as well as the challenges that hinder sentiment analysis. With the explosion of data and rapid development across scientific, social, and economic fields, many important methods and techniques have emerged to deal with data at today's very large scale.
It is hard to imagine today's society functioning without the ubiquitous presence of social media on smartphones. The widespread adoption of smartphones and the subsequent proliferation of social media have had a profound impact on people's daily routines. Several social networking sites are available, such as Facebook and Twitter. According to data from 2017, Facebook had close to 1.37 billion daily active users. Each user adds information, which may be structured, semi-structured, or unstructured. Company owners analyse this information to better cater to their clients' wants, anticipate their needs, and ultimately turn a profit. Facebook data analysis is the process of collecting information from Facebook, processing it, and presenting the findings visually.
Facebook users' activity is mined for information. The database server keeps track of user activity, the number of likes, the number of posts, the content of posts, comments, and so on. Most of this data is structured or semi-structured, while user comments are unstructured. Facebook users create petabytes of data every day. Hence, Hadoop, MapReduce, and other associated big data technologies are used in this project for data analysis.
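As an illustration of the MapReduce model mentioned above, the following is a minimal single-process sketch in Python that counts comments per post. It is not the project's Hadoop job: the sample records, the field layout, and the function names (map_phase, shuffle, reduce_phase, count_comments_per_post) are assumptions made only for this example, and on a real cluster the grouping step is performed by the framework rather than by user code.

```python
from collections import defaultdict

# Hypothetical sample records: (post_id, user, comment_text).
COMMENTS = [
    ("post1", "alice", "Great photo"),
    ("post1", "bob", "Love it"),
    ("post2", "alice", "Not sure about this"),
    ("post1", "carol", "Nice"),
]


def map_phase(record):
    """Map step: emit one (post_id, 1) pair per comment record."""
    post_id, _user, _text = record
    yield post_id, 1


def shuffle(pairs):
    """Group emitted values by key (done by the framework on a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def count_comments_per_post(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_comments_per_post(COMMENTS))
```

The same map and reduce logic could in principle be split into separate mapper and reducer scripts for a cluster run; the in-memory version is used here only so the data flow is easy to follow.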
Organizations gain a competitive advantage when they are able to operate more quickly and more efficiently than their rivals. In this project we intend to analyse a preferably large dataset using Hadoop, MapReduce, and Hive, and to classify the sentiment of the comments in the data using a long short-term memory (LSTM) network.
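To make the three-way labelling of comments concrete, the sketch below assigns positive, negative, or neutral labels using a tiny word list. This is a deliberately simple lexicon-based stand-in and not the LSTM classifier named above; the word lists and the function name classify_sentiment are assumptions chosen for illustration only.

```python
import re

# Hypothetical toy lexicons; a trained model would replace these.
POSITIVE = {"good", "great", "love", "nice", "happy"}
NEGATIVE = {"bad", "hate", "awful", "sad", "terrible"}


def classify_sentiment(comment):
    """Label a comment by counting positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["I love this, great photo", "This is awful", "See you tomorrow"]:
        print(text, "->", classify_sentiment(text))
```

A word-count rule of this kind ignores word order and negation, which is one reason a sequence model such as an LSTM is attractive for comment text.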
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Re-users must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license allows for redistribution, commercial and non-commercial, as long as the original work is properly credited.