AUTOMATE DATA SCIENCE WORKFLOWS USING DATA ENGINEERING TECHNIQUES

Authors

  • Naresh Babu Kilaru, Independent Researcher

DOI:

https://doi.org/10.36676/jrps.v12.i3.1543

Keywords:

Data Science Orchestration, Data Preparation and Processing, Process Management, Feature Creation, Data Streams

Abstract

This paper applies data engineering practices to data science, aiming to improve the speed, scale, and reproducibility of data-driven tasks. It explores using workflow management systems (WMS) to incorporate ADP and AFT across the entire data science pipeline, from data acquisition to deployment of the final model. By analyzing simulation reports and real-life cases, the work shows how automation addresses problems such as integration, processing time, and reliance on manual effort, and thereby improves decision-making and organizational processes. The main findings suggest that data engineering approaches save time and resources during data pre-processing and analysis, improve the quality and reliability of analytical findings and outputs, and are an essential component of contemporary analytical pipelines.

 

Published

30-07-2021

How to Cite

Naresh Babu Kilaru. (2021). AUTOMATE DATA SCIENCE WORKFLOWS USING DATA ENGINEERING TECHNIQUES. International Journal for Research Publication and Seminar, 12(3), 521–530. https://doi.org/10.36676/jrps.v12.i3.1543