Pig is an open-source, high-level data flow system. It lets us write complex MapReduce transformations in a simple scripting language called Pig Latin. As a rough illustration, about 10 lines of Pig Latin can do the work of about 200 lines of Java.
Pig Latin defines a set of transformations on a data set, such as aggregate, join and sort. Pig translates the statements in a Pig script into MapReduce jobs so that they can be executed within Hadoop.
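To make the three transformations named above concrete, here is a minimal in-memory sketch in Python. It is not Pig and does not run on Hadoop; it only mirrors what aggregate, join and sort do to the data. The sample data, relation names (visits, users) and function names are hypothetical, and the Pig Latin lines in the comments are rough equivalents given for orientation.

```python
from collections import defaultdict

# Hypothetical sample data, invented for illustration only.
visits = [("alice", "/home"), ("bob", "/home"), ("alice", "/cart"),
          ("carol", "/home"), ("alice", "/pay")]
users = [("alice", "US"), ("bob", "DE"), ("carol", "IN")]


def count_by_user(visit_rows):
    # Aggregate. Rough Pig Latin equivalent:
    #   grouped = GROUP visits BY user;
    #   counts  = FOREACH grouped GENERATE group AS user, COUNT(visits) AS n;
    counts = defaultdict(int)
    for user, _url in visit_rows:
        counts[user] += 1
    return list(counts.items())


def join_on_user(count_rows, user_rows):
    # Inner join. Rough Pig Latin equivalent:
    #   joined = JOIN counts BY user, users BY name;
    # Simplification: assumes each user name appears once in user_rows,
    # and keeps the join key only once in the output row.
    country_of = dict(user_rows)
    return [(user, n, country_of[user])
            for user, n in count_rows if user in country_of]


def order_by_count_desc(rows):
    # Sort. Rough Pig Latin equivalent:
    #   ranked = ORDER joined BY n DESC;
    return sorted(rows, key=lambda row: row[1], reverse=True)


def pipeline(visit_rows, user_rows):
    # Chain the three steps the way a Pig script chains relations.
    return order_by_count_desc(
        join_on_user(count_by_user(visit_rows), user_rows))


if __name__ == "__main__":
    for row in pipeline(visits, users):
        print(row)
```

Each function takes a relation (a list of tuples) and returns a new one, which is the same step-by-step data flow style a Pig script uses. In Pig, the equivalent statements would be compiled into MapReduce jobs instead of running in a single process.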
This session covers the following in detail:
- The need for Pig
- Why use Pig when MapReduce (MR) already exists?
- Use cases of Pig
- Where Pig is not useful