2. Some information about the current state of Big Data

According to Intel, as of 9/2013 the world is creating 1 petabyte of data every 11 seconds, which is equivalent to an HD video 13 years long. Companies themselves now own their own Big Data. The online sales site eBay, for example, uses two data centers with up to 40 petabytes of capacity to hold queries, searches, and customer recommendations, as well as information about its products. The online retailer Amazon.com must handle millions of operations every day, along with requests from about half a million sales partners. Amazon uses a Linux-based system, and in 2005 it owned the three largest Linux databases in the world, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB. Similarly, Facebook must manage 50 billion photos uploaded by its users, while YouTube and Google must store every user query and video along with other related information.

The SAS group offers a few more interesting figures about Big Data:

- RFID systems (a short-range connectivity technology similar to NFC but with a longer range, also used in hotel key cards) generate up to 1,000 times more data than traditional bar codes.
- In just 4 hours on Black Friday 2012, Walmart stores had to handle more than 10 million cash register transactions, a rate reported as roughly 5,000 per second.
- The delivery service UPS receives approximately 39.5 million requests from its customers every day.
- VISA processes more than 172.8 million card transactions in a single day.
- Twitter sees 500 million new tweets every day, and Facebook has 1.15 billion members, who together create an enormous mass of text, files, videos, and more.

3. Technology used in Big Data

Big Data has grown so much, and demand for it has become so great, that Software AG, Oracle, IBM, Microsoft, SAP, EMC, HP, and Dell have spent more than $15 billion on companies that specialize in data management and analysis.
In 2010, the Big Data industry was worth more than $100 billion and was growing at a rate of 10% per year, twice as fast as the software industry as a whole. As noted above, exploiting Big Data requires very specialized technology because of its enormous size and complexity. In 2011, the analyst group McKinsey suggested that the technologies usable with Big Data include crowdsourcing (pooling resources from many computing devices around the world to process data together), genetic algorithms, machine learning (systems capable of learning from data, a branch of artificial intelligence), natural language processing (like Siri or Google Voice Search, but more advanced), signal processing, simulation, time series analysis, modeling, and combining powerful servers together. These techniques are complex, so we will not discuss them in depth here.

In addition, databases that support parallel data processing, search-based applications, distributed file systems, cloud computing systems (including applications, computing resources, and storage space), and the Internet itself are also effective tools for researching and extracting information from "big data". Some relational (table-based) databases can now hold petabytes of data, and they can also load, manage, back up, and optimize the use of Big Data.

People who work with Big Data are often uncomfortable with slow data storage systems, so they prefer storage media that attach directly to the computer, just like the hard drive in your own machine. This can be anything from an SSD to SATA disks in a large storage grid. They regard NAS drives and SAN network storage systems as too complex, too expensive, and too slow. Those properties do not suit systems used to analyze Big Data, which aim for high performance, commodity infrastructure, and low cost.
In addition, Big Data analysis often needs to run in real time or near real time, so latency should be eliminated whenever and wherever possible.
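
To make the idea of parallel data processing mentioned in this section a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any of the systems named above actually work: the data is split into chunks, each chunk is processed independently, and the partial results are merged. The function names and the sample records are hypothetical, and threads stand in for what would be separate processes or machines in a real system (in standard Python, threads do not speed up CPU-bound work).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count the words in one chunk of records."""
    counts = Counter()
    for record in chunk:
        counts.update(record.lower().split())
    return counts


def split_into_chunks(records, n_chunks):
    """Split the records into at most n_chunks roughly equal slices."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def parallel_word_count(records, n_workers=4):
    """Count words by processing chunks concurrently, then merging."""
    chunks = split_into_chunks(records, n_workers)
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Reduce step: merge each partial result into the running total.
        for partial in pool.map(count_words, chunks):
            total.update(partial)
    return total


# Hypothetical sample records, invented purely for this illustration.
sample = [
    "big data needs parallel processing",
    "parallel processing splits big data",
    "each worker handles one chunk of data",
]
word_counts = parallel_word_count(sample, n_workers=2)
```

Because each chunk is counted independently, the result does not depend on how many workers are used. That independence is what lets the same pattern scale from a few threads to many machines, each working on the data stored closest to it.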