Friday, May 13, 2016

Hadoop : PIG

What is Apache Pig?
Apache Pig is an abstraction over MapReduce. It is a tool/platform used to analyze large data sets by representing them as data flows. Pig is generally used with Hadoop; we can perform all the data manipulation operations in Hadoop using Apache Pig.
To write data analysis programs, Pig provides a high-level language known as Pig Latin. This language provides various operators with which programmers can develop their own functions for reading, writing, and processing data.
To analyze data using Apache Pig, programmers need to write scripts using the Pig Latin language.
All scripts are internally converted to Map and Reduce tasks. Apache Pig has a component known as Pig Engine that accepts Pig Latin scripts as input and converts those scripts into MapReduce jobs.
Why Do We Need Apache Pig? 

Programmers who are not comfortable with Java have traditionally struggled to work with Hadoop, especially when writing MapReduce tasks. Apache Pig is a boon for all such programmers.
        Using Pig Latin, programmers can perform MapReduce tasks easily without having to type complex Java code.
        Apache Pig uses a multi-query approach, thereby reducing the length of code (see the sketch below).
For example, an operation that would require you to type 200 lines of code (LoC) in Java can often be done in as few as 10 LoC in Apache Pig. Ultimately, Apache Pig can reduce development time by almost a factor of 16.
        Pig Latin is a SQL-like language, so it is easy to learn Apache Pig if you are already familiar with SQL.
        Apache Pig provides many built-in operators to support data operations like joins, filters, and ordering. In addition, it provides nested data types like tuples, bags, and maps that are missing from MapReduce.
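To make the brevity claim concrete, here is a minimal word-count sketch in Pig Latin (the input file name and output path are placeholders); a hand-written MapReduce equivalent in Java typically runs to well over a hundred lines:
lines  = LOAD 'input.txt' AS (line:chararray);                    -- one line per record
words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;  -- split lines into words
grpd   = GROUP words BY word;                                     -- collect identical words
counts = FOREACH grpd GENERATE group, COUNT(words) AS cnt;        -- count each group
STORE counts INTO 'wordcount_out' USING PigStorage(',');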

Features of Pig


Apache Pig comes with the following features:
        Rich set of operators: It provides many operators to perform operations like join, sort, filter, etc.
        Ease of programming: Pig Latin is similar to SQL, and it is easy to write a Pig script if you are good at SQL.
        Optimization opportunities: The tasks in Apache Pig optimize their execution automatically, so programmers need to focus only on the semantics of the language.
        Extensibility: Using the existing operators, users can develop their own functions to read, process, and write data.
        UDFs: Pig provides the facility to create User Defined Functions in other programming languages such as Java and to invoke or embed them in Pig scripts (see the sketch below).
        Handles all kinds of data: Apache Pig analyzes all kinds of data, both structured and unstructured. It stores the results in HDFS.
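As a quick sketch of the UDF facility (the jar name myudfs.jar and the function myudfs.UPPER are hypothetical, used only for illustration): a Java UDF is registered and then invoked like any built-in function.
REGISTER myudfs.jar;                                  -- hypothetical jar containing the Java UDF
students = LOAD 'student_data.txt' USING PigStorage(',')
           AS (id:int, firstname:chararray);
upper_names = FOREACH students GENERATE myudfs.UPPER(firstname);  -- invoke the UDF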

Apache Pig Vs MapReduce


Listed below are the major differences between Apache Pig and MapReduce.

        Apache Pig is a data flow language, whereas MapReduce is a data processing paradigm.
        Pig Latin is a high-level language, whereas MapReduce is low level and rigid.
        Performing a Join operation is pretty simple in Apache Pig, but quite difficult in MapReduce.
        Any novice programmer with a basic knowledge of SQL can work conveniently with Apache Pig, whereas exposure to Java is a must to work with MapReduce.
        Apache Pig uses a multi-query approach, reducing the length of the code to a great extent; MapReduce requires almost 20 times more lines to perform the same task.
        There is no need for compilation in Apache Pig: on execution, every operator is converted internally into a MapReduce job, whereas MapReduce jobs have a long compilation process.


Apache Pig Vs SQL


Listed below are the major differences between Apache Pig and SQL.

        Pig Latin is a procedural language, whereas SQL is a declarative language.
        In Apache Pig, schema is optional: we can store data without designing a schema, in which case fields are referenced positionally as $0, $1, and so on. In SQL, schema is mandatory.
        The data model in Apache Pig is nested relational, while the data model used in SQL is flat relational.
        Apache Pig provides limited opportunity for query optimization, whereas there is more opportunity for query optimization in SQL.


In addition to the above differences, Apache Pig Latin:
        Allows splits in the pipeline (see the sketch below).
        Allows developers to store data anywhere in the pipeline.
        Declares execution plans.
        Provides operators to perform ETL (Extract, Transform, and Load) functions.
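A small sketch of the first two points (the file name, fields, and output path are assumed for illustration): the pipeline below splits one relation into two and stores an intermediate result while processing continues downstream.
logs = LOAD 'web_logs.txt' USING PigStorage(',') AS (url:chararray, status:int);
SPLIT logs INTO good IF status == 200, bad IF status >= 400;   -- a split in the pipeline
STORE bad INTO 'error_logs';                                   -- store data mid-pipeline
good_urls = FOREACH good GENERATE url;                         -- and keep processing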

Apache Pig Vs Hive


Both Apache Pig and Hive are used to create MapReduce jobs, and in some cases Hive operates on HDFS in a similar way to Apache Pig. Listed below are a few significant points that set Apache Pig apart from Hive.

        Apache Pig uses a language called Pig Latin, originally created at Yahoo. Hive uses a language called HiveQL, originally created at Facebook.
        Pig Latin is a data flow language, whereas HiveQL is a query processing language.
        Pig Latin is a procedural language that fits the pipeline paradigm, whereas HiveQL is a declarative language.
        Apache Pig can handle structured, unstructured, and semi-structured data, whereas Hive is mostly for structured data.

Applications of Apache Pig


Apache Pig is generally used by data scientists for performing tasks involving ad-hoc processing and quick prototyping. Apache Pig is used:
        To process huge data sources such as web logs.
        To perform data processing for search platforms.
        To process time-sensitive data loads.

Apache Pig Architecture
The language used to analyze data in Hadoop using Pig is known as Pig Latin. It is a high-level data processing language that provides a rich set of data types and operators to perform various operations on the data.
To perform a particular task, programmers need to write a Pig script using the Pig Latin language and execute it using one of the execution mechanisms (Grunt shell, script file, or embedded in Java). After execution, these scripts go through a series of transformations applied by the Pig framework to produce the desired output.
Internally, Apache Pig converts these scripts into a series of MapReduce jobs, and thus it makes the programmer's job easy. The major components of the Apache Pig architecture are described below.

Apache Pig – Components


There are various components in the Apache Pig framework. Let us take a look at the major components.

Parser

Initially the Pig Scripts are handled by the Parser. It checks the syntax of the script, does type checking, and other miscellaneous checks. The output of the parser will be a DAG (directed acyclic graph), which represents the Pig Latin statements and logical operators.
In the DAG, the logical operators of the script are represented as the nodes and the data flows are represented as edges.

Optimizer

The logical plan (DAG) is passed to the logical optimizer, which carries out logical optimizations such as projection pushdown.
Compiler 
The compiler compiles the optimized logical plan into a series of MapReduce jobs.  

Execution engine

Finally, the MapReduce jobs are submitted to Hadoop in a sorted order, where they are executed to produce the desired results.

Pig Latin – Data Model


The data model of Pig Latin is fully nested, and it allows complex non-atomic datatypes such as map and tuple.

Atom  

Any single value in Pig Latin, irrespective of its data type, is known as an Atom. It is stored as a string and can be used as a string or a number. int, long, float, double, chararray, and bytearray are the atomic values of Pig.
A piece of data or a simple atomic value is known as a field.
Ex: ‘001’ or ‘rajiv’ or ‘Hyderabad’

Tuple  

A record formed by an ordered set of fields is known as a tuple; the fields can be of any type. A tuple is similar to a row in an RDBMS table.
Ex: (001, rajiv, hyd)

Bag

A bag is an unordered set of tuples. In other words, a collection of tuples (non-unique) is known as a bag. Each tuple can have any number of fields (flexible schema). A bag is represented by '{}'. It is similar to a table in an RDBMS, but unlike an RDBMS table, it is not necessary that every tuple contains the same number of fields or that the fields in the same position (column) have the same type.
Ex: cat emp
ravi,m,10000
rani,f,40000
ram,m,50000
vani,f,60000
mani,m,90000
Bags are of two types:
   i) Outer bag
   ii) Inner bag
Outer bag:
A collection of all the tuples of a dataset is called an outer bag.
An outer bag is referenced by its relation name, simply called the "alias of the relation".

Relation 

A relation is a bag of tuples.  The relations in Pig Latin are unordered (there is no guarantee that tuples are processed in any particular order).      
  emp ---> relation
___________________
 (ravi,m,10000)
 (rani,f,40000)
 (ram,m,50000)
 (vani,f,60000)
 (mani,m,90000)
____________________
Inner bag:
A bag placed as a field inside a tuple is called an inner bag.
 grp = group emp by sex;

 grp
___________________________
 group:chararray    ,      emp:bag
________________________________
(f,{(rani,f,40000),(vani,f,60000)})
(m,{(ravi,m,10000),(ram,m,50000),(mani,m,90000)})
{(rani,f,40000),(vani,f,60000)} ---> inner bag.
When you group data, you get inner bags.

Pig has two execution modes:
1.    Local mode: pig -x local
2.    MapReduce (HDFS) mode: pig -x mapreduce
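Either mode can also run a saved script file, for example a script named wordcount.pig (the name is assumed):
$ pig -x local wordcount.pig        # runs against the local file system
$ pig -x mapreduce wordcount.pig    # runs as MapReduce jobs against HDFS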

Pig Latin – Data Model 
As discussed above, the data model of Pig is fully nested. A relation is the outermost structure of the Pig Latin data model, and it is a bag where:
        A bag is a collection of tuples.
        A tuple is an ordered set of fields.
        A field is a piece of data.

Pig Latin – Statements


While processing data using Pig Latin, statements are the basic constructs. 
        These statements work with relations. They include expressions and schemas.

        Every statement ends with a semicolon (;).

        We will perform various operations using operators provided by Pig Latin, through statements.

        Except LOAD and STORE, while performing all other operations, Pig Latin statements take a relation as input and produce another relation as output.

As soon as you enter a Load statement in the Grunt shell, its semantic checking will be carried out. To see the contents of the relation, you need to use the Dump operator. Only after performing the dump operation will the MapReduce job for loading the data from the file system be carried out.

 

Example

Given below is a Pig Latin statement, which loads data to Apache Pig.
Student_data = LOAD 'student_data.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
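Note that entering this statement only parses and checks it; no MapReduce job runs until an operator such as Dump forces execution:
grunt> Dump Student_data;    -- only now is a job launched to read and display the data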

Pig Latin – Data types


Given below are the data types of Pig Latin, each with a description and example.

int: Represents a signed 32-bit integer. Example: 8
long: Represents a signed 64-bit integer. Example: 5L
float: Represents a signed 32-bit floating point. Example: 5.5F
double: Represents a 64-bit floating point. Example: 10.5
chararray: Represents a character array (string) in Unicode UTF-8 format. Example: ‘tutorials point’
bytearray: Represents a byte array (blob).
boolean: Represents a Boolean value. Example: true/false
datetime: Represents a date-time. Example: 1970-01-01T00:00:00.000+00:00
biginteger: Represents a Java BigInteger. Example: 60708090709
bigdecimal: Represents a Java BigDecimal. Example: 185.98376256272893883

Complex Types

tuple: A tuple is an ordered set of fields. Example: (raja, 30)
bag: A bag is a collection of tuples. Example: {(raju,30),(Mohhammad,45)}
map: A map is a set of key-value pairs. Example: [‘name’#’Raju’, ‘age’#30]
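The complex types can appear directly in a LOAD schema. A small sketch (the file name, delimiter, and field names are assumptions for illustration):
complex_data = LOAD 'complex_data.txt' USING PigStorage('|') AS (
    score:tuple(math:int, physics:int),        -- a tuple field
    friends:bag{f:tuple(name:chararray)},      -- a bag of tuples
    props:map[chararray]                       -- a map with chararray values
);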


Pig Latin – Arithmetic Operators


The following list describes the arithmetic operators of Pig Latin. Suppose a = 10 and b = 20.

+ (Addition): Adds values on either side of the operator. a + b gives 30.
- (Subtraction): Subtracts the right-hand operand from the left-hand operand. a - b gives -10.
* (Multiplication): Multiplies values on either side of the operator. a * b gives 200.
/ (Division): Divides the left-hand operand by the right-hand operand. b / a gives 2.
% (Modulus): Divides the left-hand operand by the right-hand operand and returns the remainder. b % a gives 0.
? : (Bincond): Evaluates a Boolean expression and has three operands: variable x = (expression) ? value1_if_true : value2_if_false. Example: b = (a == 1) ? 20 : 30; if a = 1, the value of b is 20; if a != 1, the value of b is 30.
CASE WHEN THEN ELSE END: The CASE operator is equivalent to nested bincond operators. Example:
CASE f2 % 2
    WHEN 0 THEN 'even'
    WHEN 1 THEN 'odd'
END
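Both operators are typically used inside FOREACH…GENERATE. A sketch assuming a relation loaded from a hypothetical numbers.txt with a single int field f2 (note that CASE requires Pig 0.12 or later):
data   = LOAD 'numbers.txt' AS (f2:int);
bincnd = FOREACH data GENERATE (f2 == 1 ? 20 : 30);   -- bincond
parity = FOREACH data GENERATE (CASE f2 % 2
                                  WHEN 0 THEN 'even'
                                  WHEN 1 THEN 'odd'
                                END);                 -- case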


Pig Latin – Comparison Operators




The following list describes the comparison operators of Pig Latin. Suppose, as before, that a = 10 and b = 20.

==: Equal. Checks if the values of two operands are equal; if yes, the condition becomes true. (a == b) is not true.
!=: Not equal. Checks if the values of two operands are not equal; if they are not equal, the condition becomes true. (a != b) is true.
>: Greater than. Checks if the value of the left operand is greater than the value of the right operand; if yes, the condition becomes true. (a > b) is not true.
<: Less than. Checks if the value of the left operand is less than the value of the right operand; if yes, the condition becomes true. (a < b) is true.
>=: Greater than or equal to. Checks if the value of the left operand is greater than or equal to the value of the right operand; if yes, the condition becomes true. (a >= b) is not true.
<=: Less than or equal to. Checks if the value of the left operand is less than or equal to the value of the right operand; if yes, the condition becomes true. (a <= b) is true.
matches: Pattern matching. Checks whether the string on the left-hand side matches the regular expression on the right-hand side. Example: f1 matches '.*tutorial.*'
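Comparison operators appear most often in FILTER conditions. A sketch assuming a relation student_details with fields age:int and city:chararray, as loaded in the later examples:
adults  = FILTER student_details BY age >= 22;               -- numeric comparison
chennai = FILTER student_details BY city matches 'Chen.*';   -- regex pattern matching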





Pig Latin – Relational Operations

The following list describes the relational operators of Pig Latin, grouped by category.

Loading and Storing
        LOAD: To load data from the file system (local/HDFS) into a relation.
        STORE: To save a relation to the file system (local/HDFS).

Filtering
        FILTER: To remove unwanted rows from a relation.
        DISTINCT: To remove duplicate rows from a relation.
        FOREACH…GENERATE: To generate data transformations based on columns of data.
        STREAM: To transform a relation using an external program.

Grouping and Joining
        JOIN: To join two or more relations.
        COGROUP: To group the data in two or more relations.
        GROUP: To group the data in a single relation.
        CROSS: To create the cross product of two or more relations.

Sorting
        ORDER: To arrange a relation in sorted order based on one or more fields (ascending or descending).
        LIMIT: To get a limited number of tuples from a relation.

Combining and Splitting
        UNION: To combine two or more relations into a single relation.
        SPLIT: To split a single relation into two or more relations.

Diagnostic Operators
        DUMP: To print the contents of a relation on the console.
        DESCRIBE: To describe the schema of a relation.
        EXPLAIN: To view the logical, physical, or MapReduce execution plans used to compute a relation.
        ILLUSTRATE: To view the step-by-step execution of a series of statements.


The Load Operator 

You can load data into Apache Pig from the file system (HDFS/local) using the LOAD operator of Pig Latin.

Syntax

The load statement consists of two parts divided by the "=" operator. On the left-hand side, we mention the name of the relation where we want to store the data, and on the right-hand side, we define how we load the data. Given below is the syntax of the Load operator.
Ex: cat student_data.txt
001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai
Relation_name = LOAD 'Input file path' USING function AS schema;
Schema: (column1 : data type, column2 : data type, column3 : data type);
1.       PigStorage() - reads/writes delimited text (TextInputFormat)
2.       BinStorage() - reads/writes Pig's binary format (for binary/sequence files)
The default storage function is PigStorage(), and the default delimiter is '\t'.
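If you omit the schema, Pig still loads the data; the fields are then referenced positionally, as mentioned earlier. A sketch against the same student_data.txt:
grunt> raw = LOAD 'student_data.txt' USING PigStorage(',');   -- no schema declared
grunt> names_cities = FOREACH raw GENERATE $1, $4;            -- firstname and city by position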

grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

grunt> emp = LOAD 'emp' USING PigStorage(',') AS (ecode:int, ename:chararray, esal:int, sex:chararray, dno:int);
Store operator
This section explains how to store data in Apache Pig using the STORE operator.

Syntax


STORE Relation_name INTO ' required_directory_path ' [USING function];
Ex: cat student_data.txt
001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai
grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray ); 
Let us store the relation in the HDFS directory pig_Output as shown below.
grunt> STORE student INTO 'pig_Output/' USING PigStorage(',');

Output

After executing the store statement, you will get the following output. A directory is created with the specified name and the data will be stored in it.
--------------------------------------------------------------------------------------------------
hdfs dfs -ls 'pig_Output/'
Found 2 items
-rw-r--r--   1 Hadoop supergroup          0 2015-10-05 13:03 pig_Output/_SUCCESS
-rw-r--r--   1 Hadoop supergroup        224 2015-10-05 13:03 pig_Output/part-m-00000


You can observe that two files were created after executing the store statement. 
Using the cat command, list the contents of the file named part-m-00000 as shown below.
$ hdfs dfs -cat 'pig_Output/part-m-00000'
1,Rajiv,Reddy,9848022337,Hyderabad
2,siddarth,Battacharya,9848022338,Kolkata
3,Rajesh,Khanna,9848022339,Delhi
4,Preethi,Agarwal,9848022330,Pune
5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
6,Archana,Mishra,9848022335,Chennai

Diagnostic Operators
Pig Latin provides four diagnostic operators:

        Dump operator
        Describe operator
        Explain operator
        Illustrate operator

Dump Operator


The Dump operator is used to run Pig Latin statements and display the results on the screen. It is generally used for debugging purposes.

Syntax

grunt> Dump Relation_Name;
Example:
We have a file student_data.txt in HDFS with the following content.
1,Rajiv,Reddy,9848022337,Hyderabad
2,siddarth,Battacharya,9848022338,Kolkata
3,Rajesh,Khanna,9848022339,Delhi
4,Preethi,Agarwal,9848022330,Pune
5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
6,Archana,Mishra,9848022335,Chennai

grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray ); 

Output

Once you execute the Dump statement below, Pig starts a MapReduce job to read the data from HDFS and displays the following output on the terminal.

grunt> Dump student;
(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune)
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai)

Describe Operator
The describe operator is used to view the schema of a relation.
Syntax:
grunt> describe Relation_Name;
grunt> describe student;

 

Output

Once you execute the above Pig Latin statement, it will produce the following output.
grunt>  student: { id: int,firstname: chararray,lastname: chararray,phone:
chararray,city: chararray }

Explain Operator
The explain operator is used to display the logical, physical, and MapReduce execution plans of a relation. 

Syntax

Given below is the syntax of the explain operator.
grunt>  explain Relation_name;

Example

Assume that the relation student has been loaded from student_data.txt as in the previous sections.
grunt>  explain student;
Illustrate operator
The illustrate operator gives you the step-by-step execution of a sequence of statements.
Syntax:
grunt> illustrate Relation_name;
Example
Assume we have a file student_data.txt in HDFS.
grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray ); 
grunt> illustrate student;
Output
On executing the above statement, you will get the following output.

INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: student[1,10] C:  R:
-------------------------------------------------------------------------------
| student | id:int | firstname:chararray | lastname:chararray | phone:chararray | city:chararray |
-------------------------------------------------------------------------------
|         | 002    | siddarth            | Battacharya        | 9848022338      | Kolkata        |
-------------------------------------------------------------------------------
Group Operator
The GROUP operator is used to group the data in a single relation. It collects the data having the same key.

Syntax

Given below is the syntax of the group operator.
Group_data = GROUP Relation_name BY key;

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.
student_details.txt
1,Rajiv,Reddy,21,9848022337,Hyderabad
2,siddarth,Battacharya,22,9848022338,Kolkata
3,Rajesh,Khanna,22,9848022339,Delhi
4,Preethi,Agarwal,21,9848022330,Pune
5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
6,Archana,Mishra,23,9848022335,Chennai
7,Komal,Nayak,24,9848022334,trivendram
8,Bharathi,Nambiayar,24,9848022333,Chennai 
And we have loaded this file into Apache Pig with the relation name student_details as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

grunt> group_data = GROUP student_details by age;

grunt> Dump group_data;

Output

Then you will get output displaying the contents of the relation named group_data as shown below. Here you can observe that the resulting schema has two columns:
        One is age, by which we have grouped the relation.

        The other is a bag, which contains the group of tuples, student records with the respective age.

(21,{(4,Preethi,Agarwal,21,9848022330,Pune),(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
(22,{(3,Rajesh,Khanna,22,9848022339,Delhi),(2,siddarth,Battacharya,22,9848022338,Kolkata)})
(23,{(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram)})
You can see the schema of the relation after grouping the data using the describe command as shown below.
grunt> describe group_data;
group_data:{group:int,student_details:
{(id:int,firstname:chararray,lastname:chararray,age:int,phone:chararray,city:chararray)}}
In the same way, you can get a sample illustration of the schema using the illustrate command as shown below.
grunt> illustrate group_data;
It will produce the following output:
-------------------------------------------------------------------------------
|group_data| group:int | student_details:bag{:tuple(id:int,firstname:chararray,lastname:chararray,age:int,phone:chararray,city:chararray)}|
|          | 21        | {(4,Preethi,Agarwal,21,9848022330,Pune),(1,Rajiv,Reddy,21,9848022337,Hyderabad)}|
|          | 22        | {(2,siddarth,Battacharya,22,9848022338,Kolkata),(3,Rajesh,Khanna,22,9848022339,Delhi)}|
-------------------------------------------------------------------------------

Grouping by Multiple Columns


Let us group the relation by age and city as shown below.
grunt> group_multiple = GROUP student_details by (age, city);

You can verify the content of the relation group_multiple using the Dump operator as shown below.
grunt> Dump group_multiple;
      
((21,Pune),{(4,Preethi,Agarwal,21,9848022330,Pune)})
((21,Hyderabad),{(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
((22,Delhi),{(3,Rajesh,Khanna,22,9848022339,Delhi)})
((22,Kolkata),{(2,siddarth,Battacharya,22,9848022338,Kolkata)})
((23,Chennai),{(6,Archana,Mishra,23,9848022335,Chennai)})
((23,Bhuwaneshwar),{(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)})
((24,Chennai),{(8,Bharathi,Nambiayar,24,9848022333,Chennai)})
((24,trivendram),{(7,Komal,Nayak,24,9848022334,trivendram)})

Group All

You can group a relation by all the columns as shown below.
grunt> group_all = GROUP student_details All;
Now, verify the content of the relation group_all as shown below.
grunt> Dump group_all; 
      
(all,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram),
(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar),
(4,Preethi,Agarwal,21,9848022330,Pune),(3,Rajesh,Khanna,22,9848022339,Delhi),
(2,siddarth,Battacharya,22,9848022338,Kolkata),(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
Cogroup Operator
The COGROUP operator is used to group two or more relations.
Assume that we have two files namely student_details.txt and employee_details.txt in the HDFS directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai 
employee_details.txt
001,Robin,22,newyork
002,BOB,23,Kolkata
003,Maya,23,Tokyo
004,Sara,25,London
005,David,23,Bhuwaneshwar
006,Maggy,22,Chennai
And we have loaded these files into Pig with the relation names student_details and employee_details respectively, as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

employee_details = LOAD 'hdfs://localhost:9000/pig_data/employee_details.txt' USING PigStorage(',') AS (id:int, name:chararray, age:int, city:chararray);
Now, let us group the records/tuples of the relations student_details and employee_details with the key age, as shown below.
grunt> cogroup_data = COGROUP student_details by age, employee_details by age;

Output :


grunt> Dump cogroup_data;

(21,{(4,Preethi,Agarwal,21,9848022330,Pune),
(1,Rajiv,Reddy,21,9848022337,Hyderabad)},
   {   })

(22,{(3,Rajesh,Khanna,22,9848022339,Delhi),
(2,siddarth,Battacharya,22,9848022338,Kolkata)},
      {(6,Maggy,22,Chennai),(1,Robin,22,newyork)})

(23,{(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)},
     {(5,David,23,Bhuwaneshwar),(3,Maya,23,Tokyo),(2,BOB,23,Kolkata)})

(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram)},
       {   })

(25,{   },
      {(4,Sara,25,London)})

The COGROUP operator groups the tuples from each relation according to age, where each group depicts a particular age value.
For example, if we consider the 1st tuple of the result, it is grouped by age 21. And it contains two bags:
        the first bag holds all the tuples from the first relation (student_details in this case) having age 21, and

        the second bag contains all the tuples from the second relation (employee_details in this case) having age 21.
Join Operator
The JOIN operator is used to combine records from two or more relations. While performing a join operation, we declare one (or a group of) field(s) from each relation as keys. When these keys match, the two particular tuples are matched; otherwise the records are dropped. Joins can be of the following types:
        Inner-join
        Outer-join : left join, right join, and full join

customers.txt
1,Ramesh,32,Ahmedabad,2000.00
2,Khilan,25,Delhi,1500.00
3,kaushik,23,Kota,2000.00
4,Chaitali,25,Mumbai,6500.00
5,Hardik,27,Bhopal,8500.00
6,Komal,22,MP,4500.00
7,Muffy,24,Indore,10000.00
orders.txt
102,2009-10-08 00:00:00,3,3000
100,2009-10-08 00:00:00,3,1500
101,2009-11-20 00:00:00,2,1560
103,2008-05-20 00:00:00,4,2060
Load these two files into Pig as the relations customers and orders, for instance as sketched below.
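This sketch reuses the schemas shown in the Cross operator section later (the pig_data/ paths are assumed):
grunt> customers = LOAD 'pig_data/customers.txt' USING PigStorage(',')
          AS (id:int, name:chararray, age:int, address:chararray, salary:int);
grunt> orders = LOAD 'pig_data/orders.txt' USING PigStorage(',')
          AS (oid:int, date:chararray, customer_id:int, amount:int);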

Inner Join

An inner join returns rows when there is a match in both tables.

Syntax

Here is the syntax of performing inner join operation using the JOIN operator.
Relation3_name = JOIN Relation1_name BY key, Relation2_name BY key; 

Example

Let us perform inner join operation on the two relations customers and orders as shown below.
grunt> customer_orders = JOIN customers BY id, orders BY customer_id;

Output:

Verify the relation customer_orders using the DUMP operator as shown below.
Dump customer_orders;
You will get the following output, displaying the contents of the relation customer_orders.
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)

Outer Join 

An outer join operation is carried out in three ways –
        Left outer join
        Right outer join
        Full outer join

Left Outer Join

The LEFT OUTER JOIN operation returns all rows from the left relation, even when there are no matches in the right relation.

Syntax

Given below is the syntax of performing a left outer join operation using the JOIN operator.
Relation3_name = JOIN Relation1_name BY key LEFT OUTER, Relation2_name BY key;

Example

Let us perform left outer join operation on the two relations customers and orders as shown below.
grunt> outer_left = JOIN customers BY id LEFT OUTER, orders BY customer_id; 

Output  

Verify the relation outer_left using the DUMP operator as shown below.
Dump outer_left;
It will produce the following output, displaying the contents of the relation outer_left
(1,Ramesh,32,Ahmedabad,2000,,,,)
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)
(5,Hardik,27,Bhopal,8500,,,,)
(6,Komal,22,MP,4500,,,,)
(7,Muffy,24,Indore,10000,,,,) 

Right Outer Join

The RIGHT OUTER JOIN operation returns all rows from the right relation, even when there are no matches in the left relation.

Syntax

Given below is the syntax of performing a right outer join operation using the JOIN operator.
Relation3_name = JOIN Relation1_name BY key RIGHT OUTER, Relation2_name BY key;

Example

Let us perform right outer join operation on the two relations customers and orders as shown below.
grunt> outer_right = JOIN customers BY id RIGHT, orders BY customer_id;
Verify the relation outer_right using the DUMP operator as shown below.
grunt> Dump outer_right;

Output

It will produce the following output, displaying the contents of the relation outer_right
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)

Full Outer Join

The full outer join operation returns rows when there is a match in one of the relations.

Syntax

Given below is the syntax of performing full outer join using the JOIN operator.
Relation3_name = JOIN Relation1_name BY key FULL OUTER, Relation2_name BY key;

Example

Let us perform full outer join operation on the two relations customers and orders as shown below.
grunt> outer_full = JOIN customers BY id FULL OUTER, orders BY customer_id; 
Output
Verify the relation outer_full using the DUMP operator as shown below.
grunt> Dump outer_full;
It will produce the following output, displaying the contents of the relation outer_full
(1,Ramesh,32,Ahmedabad,2000,,,,)
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)
(5,Hardik,27,Bhopal,8500,,,,)
(6,Komal,22,MP,4500,,,,)
(7,Muffy,24,Indore,10000,,,,)

Cross Operator
The CROSS operator computes the cross-product of two or more relations. This section explains, with an example, how to use the CROSS operator in Pig Latin.

Syntax

Given below is the syntax of the Cross operator.
Relation3_name = CROSS Relation1_name, Relation2_name;

Example 

Assume that we have two files namely customers.txt and orders.txt in the /pig_data/ directory of HDFS as shown below.
customers.txt
1,Ramesh,32,Ahmedabad,2000.00
2,Khilan,25,Delhi,1500.00
3,kaushik,23,Kota,2000.00
4,Chaitali,25,Mumbai,6500.00
5,Hardik,27,Bhopal,8500.00
6,Komal,22,MP,4500.00
7,Muffy,24,Indore,10000.00
orders.txt
102,2009-10-08 00:00:00,3,3000
100,2009-10-08 00:00:00,3,1500
101,2009-11-20 00:00:00,2,1560
103,2008-05-20 00:00:00,4,2060
And we have loaded these two files into Pig with the relations customers and orders as shown below.
customers = LOAD 'pig_data/customers.txt' USING PigStorage(',') AS (id:int, name:chararray, age:int, address:chararray, salary:int);

orders = LOAD 'pig_data/orders.txt' USING PigStorage(',') AS (oid:int, date:chararray, customer_id:int, amount:int);
Let us now get the cross-product of these two relations using the CROSS operator, as shown below.
cross_data = CROSS customers, orders;

Output

It will produce the following output, displaying the contents of the relation cross_data
(7,Muffy,24,Indore,10000,103,2008-05-20 00:00:00,4,2060)
(7,Muffy,24,Indore,10000,101,2009-11-20 00:00:00,2,1560)
(7,Muffy,24,Indore,10000,100,2009-10-08 00:00:00,3,1500)
(7,Muffy,24,Indore,10000,102,2009-10-08 00:00:00,3,3000)
(6,Komal,22,MP,4500,103,2008-05-20 00:00:00,4,2060)
(6,Komal,22,MP,4500,101,2009-11-20 00:00:00,2,1560)
(6,Komal,22,MP,4500,100,2009-10-08 00:00:00,3,1500)
(6,Komal,22,MP,4500,102,2009-10-08 00:00:00,3,3000)
(5,Hardik,27,Bhopal,8500,103,2008-05-20 00:00:00,4,2060)
(5,Hardik,27,Bhopal,8500,101,2009-11-20 00:00:00,2,1560)
(5,Hardik,27,Bhopal,8500,100,2009-10-08 00:00:00,3,1500)
(5,Hardik,27,Bhopal,8500,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)
(4,Chaitali,25,Mumbai,6500,101,2009-11-20 00:00:00,2,1560)
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(2,Khilan,25,Delhi,1500,100,2009-10-08 00:00:00,3,1500)
(2,Khilan,25,Delhi,1500,102,2009-10-08 00:00:00,3,3000)
(1,Ramesh,32,Ahmedabad,2000,103,2008-05-20 00:00:00,4,2060)
(1,Ramesh,32,Ahmedabad,2000,101,2009-11-20 00:00:00,2,1560)
(1,Ramesh,32,Ahmedabad,2000,100,2009-10-08 00:00:00,3,1500)
(1,Ramesh,32,Ahmedabad,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,100,2009-10-08 00:00:00,3,1500)
(4,Chaitali,25,Mumbai,6500,102,2009-10-08 00:00:00,3,3000)
(3,kaushik,23,Kota,2000,103,2008-05-20 00:00:00,4,2060)
(3,kaushik,23,Kota,2000,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(2,Khilan,25,Delhi,1500,103,2008-05-20 00:00:00,4,2060)
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(2,Khilan,25,Delhi,1500,100,2009-10-08 00:00:00,3,1500)
(2,Khilan,25,Delhi,1500,102,2009-10-08 00:00:00,3,3000)
(1,Ramesh,32,Ahmedabad,2000,103,2008-05-20 00:00:00,4,2060)
(1,Ramesh,32,Ahmedabad,2000,101,2009-11-20 00:00:00,2,1560)
(1,Ramesh,32,Ahmedabad,2000,100,2009-10-08 00:00:00,3,1500)
(1,Ramesh,32,Ahmedabad,2000,102,2009-10-08 00:00:00,3,3000)

Union Operator
The UNION operator of Pig Latin is used to merge the content of two relations. To perform a UNION operation on two relations, their columns and domains must be identical.

Syntax

Given below is the syntax of the UNION operator.
grunt> Relation_name3 = UNION Relation_name1, Relation_name2;

Example

Assume that we have two files namely student_data1.txt and student_data2.txt in the /pig_data/ directory of HDFS as shown below.
Student_data1.txt
001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai
Student_data2.txt
7,Komal,Nayak,9848022334,trivendram
8,Bharathi,Nambiayar,9848022333,Chennai
And we have loaded these two files into Pig with the relations student1 and student2 as shown below.
student1 = LOAD 'hdfs://localhost:9000/pig_data/student_data1.txt' USING PigStorage(',') as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

student2 = LOAD 'hdfs://localhost:9000/pig_data/student_data2.txt' USING PigStorage(',') as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
Let us now merge the contents of these two relations using the UNION operator as shown below.
student = UNION student1, student2;

Output

 

Verify the relation student using the DUMP operator as shown below.
Dump student;
It will display the following output, displaying the contents of the relation student.
(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune)
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai)
(7,Komal,Nayak,9848022334,trivendram)
(8,Bharathi,Nambiayar,9848022333,Chennai)

Split Operator
The Split operator is used to split a relation into two or more relations.

Syntax

Given below is the syntax of the SPLIT operator.
grunt> SPLIT Relation1_name INTO Relation2_name IF (condition1), Relation3_name IF (condition2);

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.

student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai 
And we have loaded this file into Pig with the relation name student_details as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
Let us now split the relation into two: one listing the students of age less than 23, and the other listing the students having an age between 22 and 25.
SPLIT student_details into student_details1 if age<23, student_details2 if (22<age and age<25);

Output

Verify the relations student_details1 and student_details2 using the DUMP operator as shown below.
Dump student_details1;
Dump student_details2;
It will produce the following output, displaying the contents of the relations student_details1 and student_details2 respectively.

Dump student_details1;
(1,Rajiv,Reddy,21,9848022337,Hyderabad)
(2,siddarth,Battacharya,22,9848022338,Kolkata)
(3,Rajesh,Khanna,22,9848022339,Delhi)
(4,Preethi,Agarwal,21,9848022330,Pune)

Dump student_details2;
(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,23,9848022335,Chennai)
(7,Komal,Nayak,24,9848022334,trivendram)
(8,Bharathi,Nambiayar,24,9848022333,Chennai)




Filter Operator
The filter operator is used to select the required tuples from a relation based on a condition.

Syntax

Given below is the syntax of the FILTER operator.
grunt> Relation2_name = FILTER Relation1_name BY (condition);

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai 
And we have loaded this file into Pig with the relation name student_details as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
Let us now use the Filter operator to get the details of the students who belong to the city Chennai.
filter_data = FILTER student_details BY city == 'Chennai';

Output

Verify the relation filter_data using the DUMP operator as shown below.
Dump filter_data;
It will produce the following output, displaying the contents of the relation filter_data.
(6,Archana,Mishra,23,9848022335,Chennai)
(8,Bharathi,Nambiayar,24,9848022333,Chennai)

Distinct operator
The Distinct operator is used to remove redundant (duplicate) tuples from a relation.

Syntax

Given below is the syntax of the DISTINCT operator.
grunt> Relation_name2 = DISTINCT Relation_name1;

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai
006,Archana,Mishra,9848022335,Chennai
And we have loaded this file into Pig with the relation name student_details as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') as (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);
Let us now remove the redundant (duplicate) tuples from the relation student_details using the DISTINCT operator, and store the result as another relation named distinct_data as shown below.
distinct_data = DISTINCT student_details;

OUTPUT

Verify the relation distinct_data using the DUMP operator as shown below.
Dump distinct_data;
It will produce the following output, displaying the contents of the relation distinct_data.
(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune)
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai) 
Foreach operator
The FOREACH operator is used to generate specified data transformations based on the column data.

Syntax

Given below is the syntax of the FOREACH operator.
grunt> Relation_name2 = FOREACH Relation_name1 GENERATE (required data);

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai
And we have loaded this file into Pig with the relation name student_details as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
Let us now get the id, age, and city values of each student from the relation student_details and store them in another relation named foreach_data using the FOREACH operator as shown below.
foreach_data = FOREACH student_details GENERATE id,age,city;

Output

Verify the relation foreach_data using the DUMP operator as shown below.
Dump foreach_data;
(1,21,Hyderabad)
(2,22,Kolkata)
(3,22,Delhi)
(4,21,Pune)
(5,23,Bhuwaneshwar)
(6,23,Chennai)
(7,24,trivendram)
(8,24,Chennai)
Order By Operator
The ORDER BY operator is used to display the contents of a relation in a sorted order based on one or more fields.

Syntax           

Given below is the syntax of the ORDER BY operator.
grunt> Relation_name2 = ORDER Relation_name1 BY field_name (ASC|DESC);

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai
And we have loaded this file into Pig with the relation name student_details as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
Let us now sort the relation in descending order based on the age of the student and store the result in another relation named order_by_data using the ORDER BY operator as shown below.
order_by_data = ORDER student_details BY age DESC;

Output

Verify the relation order_by_data using the DUMP operator as shown below.
Dump order_by_data;
It will produce the following output, displaying the contents of the relation order_by_data.
(8,Bharathi,Nambiayar,24,9848022333,Chennai)
(7,Komal,Nayak,24,9848022334,trivendram)
(6,Archana,Mishra,23,9848022335,Chennai)
(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)
(3,Rajesh,Khanna,22,9848022339,Delhi)
(2,siddarth,Battacharya,22,9848022338,Kolkata)
(4,Preethi,Agarwal,21,9848022330,Pune)
(1,Rajiv,Reddy,21,9848022337,Hyderabad)
Limit Operator
The LIMIT operator is used to get a limited number of tuples from a relation.                                                                                           

Syntax           

Given below is the syntax of the LIMIT operator.
grunt> Result = LIMIT Relation_name number_of_tuples;

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai
And we have loaded this file into Pig with the relation name student_details as shown below.
student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
Now, let us get the first four tuples of the relation and store them in another relation named limit_data using the LIMIT operator as shown below.
limit_data = LIMIT student_details 4;

Output

Verify the relation limit_data using the DUMP operator as shown below.
Dump limit_data;  
It will produce the following output, displaying the contents of the relation limit_data as follows.
(1,Rajiv,Reddy,21,9848022337,Hyderabad)
(2,siddarth,Battacharya,22,9848022338,Kolkata)
(3,Rajesh,Khanna,22,9848022339,Delhi)
(4,Preethi,Agarwal,21,9848022330,Pune)      




