What is Big Data
You have probably already heard the term Big Data. If not, chances are you will hear it very soon, because this is the direction in which technology is going. So the question is: what is big data? The name is misleading, because it gives the impression that above a certain size data is big and below that size it is small. To measure the size of data, we deal with KB, MB, GB and TB on a day-to-day basis, and if we go a little higher than that we have petabytes (PB), exabytes (EB), zettabytes (ZB) and yottabytes (YB). A petabyte is 10 to the power 15 bytes, and each following unit is 1,000 times larger than the one before it.
So from what point onwards does big data start? The answer is: it depends. Big data could start from any point. There is no definitive definition of big data. It is mostly defined this way: big data is data that becomes difficult to process with traditional systems because of its size.
Just to put things in perspective, let's say you have created a 100 MB document and want to share it with your colleagues, but you are unable to send it via email. This becomes big data for you, because its large size prevents you from using traditional methods with this document.
Let's say you have an image file of 100 GB and you are unable to display it on your monitor in real time because of its size, so this becomes big data for you in this context. Or let's say you have a video file of 100 TB and you are unable to edit it using your software, so this video file becomes big data for you in this context. So the term big data is relative to the capabilities of the system; at a higher level, the term is relative to the capabilities of the organization.
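To make the unit ladder concrete, here is a minimal Python sketch that formats a byte count with the largest fitting unit. It assumes decimal units, where each step is a factor of 1,000; the function name `human_readable` is made up for this example.

```python
# Decimal byte units: each step up the list is a factor of 1,000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_readable(num_bytes):
    """Format a byte count using the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the last unit.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(human_readable(10**15))       # 1 PB
print(human_readable(100 * 10**6))  # 100 MB
```

Some tools use binary units instead (1 KiB = 1,024 bytes), so the exact numbers can differ slightly from what your operating system reports.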
Areas of challenges
Let's say the same stream of data is coming in to two companies, company 1 and company 2. The data consists of different unstructured items like text, audio, video etc., and 500 TB is coming in on a daily basis. This data set could be big data for one company and not for the other; it depends on the capabilities of the company. Here we are assuming that one company is all set to digest this volume of data, this variety of data, at this velocity, while the other is not at par yet. Traditional systems, including relational databases, are not capable of handling big data, and challenges spring up at multiple levels, including capturing, curating, storing, analyzing, searching, sharing, transferring and even visualizing the data.
Big data becomes a challenge for a traditional system not merely because of its size, although size alone can be challenging. The challenge may also arise because of the speed at which the data is coming in, and because it is unstructured and may contain data items of various formats.
So big data is usually characterized by three V attributes: velocity, volume and variety.
Velocity: Velocity refers to the speed at which the data is coming in. For example, in scientific experiments at facilities that collide subatomic particles, 40 terabytes of data could come in within one second, which is a very high speed.
Volume: Volume is of course a problem. The data keeps accumulating and the files become too large to be handled by traditional systems. Facebook generates 25 terabytes of data daily, so just imagine the size of the data that has accumulated since the beginning. In traditional systems, the data is structured and stored in well-planned tables; each table has specific columns, and each column accepts values of a specific data type.
Variety: The third problem we mentioned is variety. When big data comes in, it may include items in a variety of formats: audio files, video files, and unstructured data like text messages. That is sometimes challenging for traditional systems to handle.
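The idea that the same stream is big data for one organization and not for another can be sketched in a few lines of Python. This is only an illustration: the class names, field names and capacity numbers are all made up, and only the 500 TB per day figure comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    daily_volume_tb: float      # volume: how much data arrives per day
    peak_rate_tb_per_s: float   # velocity: fastest arrival rate
    formats: set                # variety: kinds of items in the stream


@dataclass
class SystemCapacity:
    max_daily_volume_tb: float
    max_rate_tb_per_s: float
    supported_formats: set


def challenges(workload, system):
    """Return the V attributes on which the workload exceeds the system."""
    problems = []
    if workload.daily_volume_tb > system.max_daily_volume_tb:
        problems.append("volume")
    if workload.peak_rate_tb_per_s > system.max_rate_tb_per_s:
        problems.append("velocity")
    # Variety is a problem if any incoming format is unsupported.
    if not workload.formats <= system.supported_formats:
        problems.append("variety")
    return problems


# Hypothetical numbers, apart from the 500 TB per day from the text.
stream = Workload(500, 0.01, {"text", "audio", "video"})
ready_company = SystemCapacity(1000, 0.05, {"text", "audio", "video"})
other_company = SystemCapacity(50, 0.001, {"text"})

print(challenges(stream, ready_company))  # []
print(challenges(stream, other_company))  # ['volume', 'velocity', 'variety']
```

An empty list means the stream is ordinary data for that system; any entry in the list means it is big data in that context.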
The explosion of Big Data is a very recent phenomenon, and companies have started to realize that they should capture all this data that is being produced, try to analyse it, and try to get value out of it. These days, decision making is performed solely on structured data, which is mostly stored in applications like ERP and other related applications running in an enterprise. So most of the unstructured data gets wasted. It is not captured, and even if it is captured, it is not analyzed. If it is analyzed, the real value is not extracted because of the limitations of traditional systems.
Sources of Big Data
At a very high level, a growing number of users, applications, systems and sensors are producing large files. Some examples of data generation points are mobile devices, microphones, readers, scanners, science facilities, cameras, social media websites and software programs, which produce data in the form of video files, photos, text messages, books, audio, logs, emails, transactions, images, click trails, documents and public records.
- Airbus: An Airbus aircraft generates 10 TB of data every 30 minutes, and about 640 TB in one flight.
- Smart Meters: Smart meters in houses read the usage every 15 minutes; about 350 billion transactions are recorded in a year. In 2009 there were 76 million smart meters, and by 2014 there were 200 million.
- Smart Phones: There are 5 billion camera phones worldwide. Most of them have location awareness (GPS), and 22% of them are smartphones. Cellphones and smartphones are major players in creating large volumes of data. Every day we upload 55 million pictures.
- Internet Users: In 2015, the International Telecommunication Union estimated that about 3.2 billion people, or almost half of the world's population, would be online by the end of the year. Cisco estimated that in 2016 global IP traffic was 1.2 ZB per year, or 96 EB per month (one exabyte is one billion gigabytes).
- Blogs: There are about 200 million blog entries on the web.
- Emails: 300 billion emails are sent every day.
- Facebook: Facebook generates 25 TB of data every day.
- Twitter: Twitter generates 12 TB of data daily, with 200 million users producing 230 million tweets a day.
- YouTube: 2.9 billion videos are watched on YouTube per month.
- Experiments: Scientific facilities that collide subatomic particles can generate up to 40 terabytes per second, which is a lot of data.
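To put some of the figures above on a common scale, here is a small sketch that converts the daily and monthly rates into yearly totals. It assumes decimal units (1 PB = 1,000 TB, 1 ZB = 1,000 EB) and a 365-day year; the function names are made up for this example.

```python
TB_PER_PB = 1000
EB_PER_ZB = 1000


def tb_per_day_to_pb_per_year(tb_per_day, days=365):
    """Convert a daily rate in TB into a yearly total in PB."""
    return tb_per_day * days / TB_PER_PB


def eb_per_month_to_zb_per_year(eb_per_month):
    """Convert a monthly rate in EB into a yearly total in ZB."""
    return eb_per_month * 12 / EB_PER_ZB


print(tb_per_day_to_pb_per_year(25))    # 9.125
print(eb_per_month_to_zb_per_year(96))  # 1.152
```

So 25 TB a day adds up to roughly 9 PB a year, and 96 EB a month is about 1.15 ZB a year, which is consistent with the rounded figure of 1.2 ZB per year.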
What we will see in the coming years: companies will try to capture 90% of the unstructured data that is being wasted. They will not only try to capture it, they will try to analyse and understand it and to get meaningful information out of it to create an edge over the competition. Governments want to use big data to forecast events such as unfortunate incidents, civil unrest and the spread of diseases, so that they can take proactive action.
Big Data influence
Influence on health sector:
Big data analytics tools and repositories take over much of the manual analysis and generate reliable, calculated insights out of huge volumes of data within a matter of seconds. The big data revolution is bringing up sophisticated methods of consolidating information from many sources. The focus is on providing the most relevant and up-to-date information to doctors and medical practitioners in real time while they are consulting their patients. Until now, the collection of data has been limited to the major available resources in the healthcare sector. However, with the advent of smartphone apps and wearables, data is now everywhere, and this allows practitioners to know patients’ health conditions more precisely. Apps that act as pedometers to measure your steps, calorie counters for your diet, apps for monitoring and recording heart rate, blood pressure and blood sugar levels, and wearable devices like Fitbit, Jawbone etc. are all sources of data nowadays. In the near future, the patient will share this data with the doctor, who can use it as a diagnostic toolbox to provide better treatment in less time.
Shopping habit analysis: Understanding shopper behavior is essential for business success. Big data is an essential component of the process, and provides information on trends, spikes in demand and customer preferences. Business owners can use that data to make sure the most popular products are available and being marketed. If customers visit your site to search for products you don’t offer, big data is how you will learn about those searches, helping you seize new opportunities. In 2018, big data analysis will continue illuminating important shopper behaviors and patterns, such as popular shopping times and spikes in product searches.
Better consumer service: Statistics regarding unhappy customers and poor customer service are alarming. For instance, 91 percent of unhappy customers will not willingly do business with a company if they’ve had a poor customer service experience. Focusing on customer service is crucial to the success of all e-commerce businesses.
Understanding your shoppers is important, but even more important is making it easy for customers to contact your business, resolve issues or find answers to their questions. Big data provides the metrics needed to see how quickly customers are able to complete these tasks.
Through this year, expect big data to continue offering ways to track customer service experiences, but also to add even more predictive monitoring. This will help online companies identify potential problems and resolve them before a customer even gets involved.
Easier and more secure online payments:
Big data has a significant role in making online payments easier and more secure. Here are several ways big data is changing the e-commerce payment industry in 2018:
- Big data integrates all different payment functions into one centralized platform. Not only does it help with ease of use for customers, it also helps reduce fraud risks.
- The advanced analytics offered by big data are powerful and intuitive enough to discover fraud in real time and to provide proactive solutions for identifying risks.
- Big data can detect money laundering transactions that appear as legitimate payments.
- Recently, payment providers have started realizing the potential of monetizing merchant analytics. Payment providers can help different merchant retailers understand their customers better.
- Data analytics allows e-commerce businesses to cross sell and upsell.
- Sales generated by push notifications act as an effective means to validate customer data.
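As a rough illustration of what discovering fraud in real time can mean, the sketch below flags a payment whose amount is far from the running average of the payments seen so far. This is a deliberately simple stand-in (a running mean and standard deviation with a three-sigma rule), not how any particular payment provider works; the class name, function name, threshold and sample amounts are all assumptions made for this example.

```python
import math


class RunningStats:
    """Welford's online algorithm for a running mean and variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_unusual(amounts, threshold=3.0, warmup=5):
    """Yield (index, amount) for amounts far from the running mean."""
    stats = RunningStats()
    for i, amount in enumerate(amounts):
        std = stats.std()
        far = std > 0 and abs(amount - stats.mean) > threshold * std
        if stats.n >= warmup and far:
            # Flagged amounts are reported and not folded into the statistics.
            yield i, amount
        else:
            stats.update(amount)


# Hypothetical payment amounts; the sixth index is an obvious outlier.
payments = [20, 22, 19, 21, 20, 23, 5000, 21]
print(list(flag_unusual(payments)))  # [(6, 5000)]
```

Because the statistics are updated one payment at a time, nothing has to be stored or re-scanned, which is what makes this style of check usable on a live stream.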
The big data revolution of the 21st century has found a resonance with banking firms, considering the valuable data they have been storing for many decades. This data has now unlocked secrets of money movements, helped prevent major disasters and thefts, and helped banks understand consumer behaviour. Banks reap the most benefits from big data, as they can now extract good information quickly and easily from their data and convert it into meaningful benefits for themselves and their customers. Banks internationally are beginning to harness the power of data in order to derive utility across various spheres of their functioning, ranging from sentiment analysis, product cross selling, regulatory compliance management, reputational risk management, financial crime management and much more.
Help in learning spending patterns of customers: With the help of customers’ payment data, banks and financial companies can understand the spending patterns of customers. This helps the companies identify when potential customers may require certain financial services. With the help of Big Data analytics, one can see spending patterns differentiated according to demographics, average income, etc.
Customer spending patterns can also help in identifying the customers who are your most valuable people, the ones who spend the most money. Through this data, you can provide financial offers that can help them feel valued as your customer. Also, Big Data can be used to detect high-risk spending patterns that affect customers negatively in the long run and safeguard them from a lot of pain.
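A minimal sketch of this kind of analysis, using plain Python on a handful of made-up transactions: it averages spending per demographic group and ranks customers by total spend. The field names (`customer`, `age_band`, `amount`), the function names and the data are all hypothetical.

```python
from collections import defaultdict


def average_spend_by_group(transactions, group_key):
    """Average transaction amount for each value of group_key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tx in transactions:
        group = tx[group_key]
        totals[group] += tx["amount"]
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def top_spenders(transactions, n=1):
    """Customers ranked by total amount spent, highest first."""
    per_customer = defaultdict(float)
    for tx in transactions:
        per_customer[tx["customer"]] += tx["amount"]
    return sorted(per_customer, key=per_customer.get, reverse=True)[:n]


# Hypothetical sample data.
transactions = [
    {"customer": "a", "age_band": "18-30", "amount": 40.0},
    {"customer": "b", "age_band": "31-50", "amount": 120.0},
    {"customer": "a", "age_band": "18-30", "amount": 60.0},
]

print(average_spend_by_group(transactions, "age_band"))
print(top_spenders(transactions))
```

At real scale the same grouping would run on a distributed engine rather than in a single Python process, but the logic of grouping and then aggregating stays the same.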
Customer segmentation: Banks have all kinds of customers with different financial requirements and different financial behaviours. With the help of Big Data, banks can categorise their clients based on these parameters along with the parameters already mentioned in the first point. Segmentation helps banks target marketing promotions at audiences according to the services that will be beneficial to them. This helps build better customer relationships without spamming their inboxes.
Risk management: Risks in the banking and financial industry may come in any form: fraudulent activities, bad loans or failed investments. Early detection of these risks can help prevent huge losses which might otherwise be incurred. Big Data can help analyze problems on a large scale and, with the help of analytics, divide them into smaller, manageable ones.
We have already seen some of the big financial institutions venture into Big Data and find that it positively impacts their business activities. In the long run, Big Data will be beneficial to both the banking or financial institutions and their customers. So its adoption can be expected in more and more companies in the near future. If you are looking to make data analysis work wonders in the banking industry, you can go for a Big Data course. Big Data is the future.
While we’re not quite at the level of predictive policing seen in Hollywood’s Minority Report, big data does help police each day determine where and when crime is likely to occur next. This helps your local police force decide where to station officers at any given time. Thanks to big data, our streets are a bit safer, with fewer wasted resources.