Scholarly open access journals, Peer-reviewed, and Refereed Journals, Impact factor 8.14 (Calculate by google scholar and Semantic Scholar | AI-Powered Research Tool) , Multidisciplinary, Monthly, Indexing in all major database & Metadata, Citation Generator, Digital Object Identifier(DOI)
Big data has become part of everyday life. The term refers to collections of data sets so large and complex that they are difficult to process, manage, and analyse in a timely manner. The amount of data created worldwide continues to grow at an accelerating rate, which makes big data increasingly hard to handle efficiently.
In a data stream management system (DSMS), it is common for a service to contain multiple tasks and for a task to run as multiple instances. A service is composed of an input, an output, and tasks. The input continuously reads new messages and emits them to the stream. The output writes streams in a specified format (e.g., to a file). A task is executed in response to newly arrived stream data from the input or from a previous task's emission. When a task would interfere with processing in other tasks, it becomes necessary to run multiple copies of that single task. The single task is therefore split into multiple instances, which reside on physical instances.
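The service structure described above (input, task instances, output) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `read_input`, `pick_instance`, `run_service`, and `format_output` are hypothetical, and routing messages to instances by a hash of the message key is one possible choice that the abstract does not specify.

```python
import zlib
from collections import defaultdict


def read_input(records):
    """Input: emit (key, value) messages onto the stream one at a time."""
    for key, value in records:
        yield key, value


def pick_instance(key, num_instances):
    """Assumed routing rule: a stable hash of the key selects the instance."""
    return zlib.crc32(key.encode("utf-8")) % num_instances


def run_service(records, task, num_instances):
    """Run one task as several instances; each instance sees only its own keys."""
    per_instance = defaultdict(list)
    for key, value in read_input(records):
        instance_id = pick_instance(key, num_instances)
        per_instance[instance_id].append(task(value))
    return dict(per_instance)


def format_output(per_instance):
    """Output: write the results in a fixed text format, one line per instance."""
    return [f"instance={i} results={per_instance[i]}" for i in sorted(per_instance)]
```

Because the routing is keyed, all messages with the same key reach the same instance, so per-key state would never be shared between instances. In a real DSMS the instances would run in parallel on separate physical nodes; here they are simulated sequentially in one process.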
Today, many big data processing frameworks exist to handle large volumes of data. In this paper, we discuss top-level projects of the Apache Software Foundation (ASF) that serve as general-purpose data processing platforms. They have a wide field of application and are usable for dozens of big data scenarios. Apache Spark is based on resilient distributed datasets (RDDs). This in-memory data structure underpins Spark's functional programming paradigm. Spark is capable of large batch computations by pinning data in memory. Spark Streaming, in particular, wraps data streams into mini-batches: it collects all data that arrives within a certain period of time and runs a regular batch program on the collected data. While the batch program is running, the data for the next mini-batch is collected.
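The mini-batch idea can be illustrated with a small plain-Python sketch. It does not use the Spark API; the function names `mini_batches` and `run_batches`, the `(timestamp, value)` event format, and the fixed-length interval are assumptions made for illustration only.

```python
from collections import defaultdict


def mini_batches(events, interval):
    """Group (timestamp, value) events into consecutive windows of `interval` seconds."""
    windows = defaultdict(list)
    for timestamp, value in events:
        # All events whose timestamp falls in the same interval share a window.
        windows[int(timestamp // interval)].append(value)
    # Windows that received no events are skipped in this sketch.
    return [windows[i] for i in sorted(windows)]


def run_batches(events, interval, batch_program):
    """Run an ordinary batch function once per collected mini-batch."""
    return [batch_program(batch) for batch in mini_batches(events, interval)]
```

For example, with a one-second interval, `run_batches([(0.1, 1), (0.5, 2), (1.2, 3)], 1.0, sum)` applies `sum` to the events of the first second and then to those of the second. The sketch processes a finished list of events, so it does not model the overlap in which the next mini-batch is collected while the current batch program runs.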
Keywords:
data, streaming, framework
Cite Article:
"Multi Instance Task in Data Stream Management System", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN: 2455-2631, Vol. 7, Issue 9, pp. 157-161, September 2022. Available: http://www.ijrti.org/papers/IJRTI2209020.pdf
ISSN:
2456-3315 | IMPACT FACTOR: 8.14 Calculated By Google Scholar| ESTD YEAR: 2016