A baby enters the world. The baby has five senses, with which his/her brain gains information. A huge amount of random data enters his/her brain.
The baby tries to stand up, and falls, and repeats this process several times, after which he/she learns not just to stand up, but also to walk. The baby gathered the data, analysed the data, and finally, learnt from the data.
This is data analysis.
Let’s jump into Big Data. For this, we’ll need to consider the digital world.
To help you understand this, I need you to set aside any concerns about privacy for now, and read on till the end of this section.
You Google. You search for a place on Google Maps. You set your alarm at one particular time of the day. You add reminders on your phone. You do incognito stuff. You date. You access social networks.
You live in the digital world.
And in the digital world, you leave a trail of breadcrumbs, from the very moment you start living in it. From the moment you opened a Google account, or a Facebook account, or even opened the web browser from a computer or a mobile device, you have been alive in the digital world.
Digital information is recorded, constantly.
However, this data is being obtained from the entire population of the globe. It is unrelated, random, unorganized, and its size is enormous.
In 2010, Eric Schmidt, then Google’s CEO, gave us the following statistic:
Every two days now, we create as much information as we did from the dawn of civilization up until 2003, according to Schmidt. That’s something like five exabytes of data, he says.
Let me repeat that: we create as much information in two days now, as we did from the dawn of man through 2003.
So, how do we handle this amazing resource?
Big Data comes in at this point.
Googling Big Data gives:
extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions.
Now, what matters to the world is the end result of the analysis of data, such as:
- An immense rise in the efficiency of the police force
- Foiling terrorist attacks
- Streamlining marketing efforts
- Predicting human behaviour
- Artificial intelligence (gradually this will come out of science fiction, into the real world)
These are only a few, off the top of my head. There are countless applications of Big Data. Here’s a video that might help you to understand this information better:
What is Big Data?
This part is a bit more technical than what you have read/watched so far. You can skip over to the next section if you wish to.
Big Data has three dimensions:
- Volume: This is the fundamental attribute of Big Data. Numerous devices feed data into numerous servers. For example, 12 TB of Twitter data needs to be processed every day, while Facebook digests 500 TB of data!
- Velocity: Data becomes obsolete extremely quickly, and large amounts need to be processed in real time. For example, clickstreams capture user behavior at millions of events per second!
- Variety: This refers to the many kinds of data described in the previous section. There is structured data and there is unstructured data. Unstructured data includes text, audio, video, sensor data, geospatial data, logs, and everything else! To predict someone’s future actions, both types of data need to be analysed together.
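To make the velocity point a little more concrete, here is a minimal sketch of counting clickstream events in one-second windows. The record layout (a timestamp in seconds plus a user id) and the sample values are assumptions made up for illustration; a real clickstream would arrive continuously and at a far larger scale.

```python
from collections import Counter

def events_per_second(events):
    """Count events in one-second windows.

    `events` is an iterable of (timestamp_seconds, user_id) pairs.
    This layout is a hypothetical stand-in for a real clickstream.
    """
    counts = Counter()
    for timestamp, _user_id in events:
        counts[int(timestamp)] += 1  # bucket each event by its whole second
    return dict(counts)

# Made-up sample data: six clicks spread over three seconds.
sample_clicks = [
    (0.10, "a"), (0.75, "b"),
    (1.20, "a"),
    (2.05, "c"), (2.90, "b"), (2.99, "a"),
]
per_second = events_per_second(sample_clicks)
print(per_second)
```

The function looks at each event only once and keeps just a small running count, which is roughly the kind of trade-off that real-time processing tends to rely on.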
Requirements of Big Data processing
The term ‘Big Data’ is ambiguous. In the decade 1990-2000, any data that could not be handled with MS-Excel required Unix workstations. Nowadays, data that cannot be placed within a relational database, and analyzed with a statistics package on a desktop, requires parallel software running on numerous servers.
Hence, ‘Big’ is a term that refers to data size which forces us to look beyond the easily available resources.
In ‘our’ time, that means resources such as DRAM and flash memory, parallel processing (high-bandwidth, low-latency interconnects between processors, memory and flash technologies), and logical copies of data.
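The idea behind parallel processing can be sketched on a single machine: split the data into chunks, process each chunk separately, then combine the partial results. The example below is only an illustration under stated assumptions. It uses Python's standard thread pool as a stand-in for "numerous servers", and the function names and sample lines are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_words(chunk):
    """Process one chunk: count the words in a list of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts

def parallel_word_count(lines, workers=4):
    """Split the lines into chunks, count each chunk concurrently, then merge."""
    chunk_size = max(1, len(lines) // workers)
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    total = Counter()
    # Threads here stand in for separate servers in a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_words, chunks):
            total += partial  # combine the partial counts
    return total

# Made-up sample data.
logs = ["big data is big", "data moves fast", "big servers process data"]
word_totals = parallel_word_count(logs, workers=2)
print(word_totals["big"], word_totals["data"])
```

Each chunk is counted independently, so in principle the chunks could sit on different machines and only the small partial counts would need to travel over the network.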
Applications of Big Data
First, let me tell you about some past applications of Big Data:
The BJP won the Indian General Election of 2014. Big Data analysis helped.
Barack Obama got re-elected in 2012. Big Data played a big role.
Familiar with Netflix? Traffic patterns help improve the reliability of video streaming, and its recommendation engine works with the help of the Hadoop data processing platform.
Weather predictions can be made with Big Data. WeatherSignal is an app that works by re-purposing the sensors in Android devices to map atmospheric readings.
The Large Hadron Collider requires Big Data analysis.
Internet of Things
Big Data analysis is the basis of artificial intelligence. Currently, it can be used to find out patterns in an individual’s activities.
Remember ‘The Jetsons’?
There’s one element of that TV series that I need you to remember: the automated home.
You wake up with your alarm. A conveyor belt takes you to your shower, which turns on automatically. As it turns off, the coffee pot and the toaster do their job. Then you go to your car (which is cleaned), you turn on the ignition, and leave for office. As you start your car, the house gets locked, and all unnecessary electrical appliances are turned off.
Basically, the household appliances communicate with each other.
Sounds futuristic? The Internet of Things (IoT) can turn this into reality.
What is IoT?
- The Internet of Things (IoT) is a scenario in which objects, animals or people are provided with unique identifiers and the ability to transfer data over a network without requiring human-to-human or human-to-computer interaction. (from Whatis.com)
- The Internet of Things allows objects to be sensed and controlled remotely across existing network infrastructure. (from Wikipedia)
IoT needs data to analyze human patterns. Big Data analysis provides the data.
IoT and Big Data analysis work together. It’s the union of people, data, appliances, and predictive (intelligent) algorithms.
The Data Science Foundation and NASSCOM are going to organize the Second International Data Science Summit, on the 28th of August in Kolkata.
Big Data is a growing buzzword. And it will play a major role in defining the future.
This summit will give you knowledge on not only Big Data, but several other topics such as machine learning, sentiment analysis, and predictive modelling. Be a part of this summit.