Data Operations in Python

Last updated on Dec 13 2021
Amarnath Garg

Python handles data in various formats mainly through two libraries, Pandas and NumPy. We have already seen the important features of these two libraries. In this blog we will look at some basic examples from each library that show how to operate on data.

Data Operations in Numpy

The most important object defined in NumPy is an N-dimensional array type called ndarray. It describes a collection of items of the same type. Items in the collection can be accessed using a zero-based index. An instance of the ndarray class can be constructed by several array creation routines. The basic ndarray is created using the array function in NumPy:

numpy.array
The following examples show basic NumPy data handling.
Example 1
# more than one dimension
import numpy as np
a = np.array([[1, 2], [3, 4]])
print(a)
The output is as follows:
[[1 2]
 [3 4]]
Example 2
# minimum dimensions
import numpy as np
a = np.array([1, 2, 3, 4, 5], ndmin=2)
print(a)
The output is as follows:
[[1 2 3 4 5]]
Example 3
# dtype parameter
import numpy as np
a = np.array([1, 2, 3], dtype=complex)
print(a)
The output is as follows:
[1.+0.j 2.+0.j 3.+0.j]
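The ndarray properties described earlier (one element type per array, zero-based indexing) can be checked directly. The following is a minimal sketch rather than part of the original examples; the variable names `first_row` and `mixed` are only illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Every element of an ndarray shares one dtype.
print(a.dtype)   # a platform-dependent integer type
print(a.shape)   # (2, 2)
print(a.ndim)    # 2

# Indexing is zero-based: row 0, column 1 holds the value 2.
print(a[0, 1])   # 2

# A single index on a 2-D array returns a whole row.
first_row = a[0]
print(first_row)

# Mixing integers and floats upcasts everything to a common type.
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64
```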
Data Operations in Pandas

Pandas handles data through Series, DataFrame, and Panel. We will see some examples of each.

Pandas Series

A Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index. A pandas Series can be created using the following constructor:

pandas.Series( data, index, dtype, copy)

Example

Here we create a Series from a NumPy array.

# import the pandas library, aliased as pd
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data)
print(s)

Its output is as follows:

0    a
1    b
2    c
3    d
dtype: object
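The constructor above also accepts index and dtype arguments, which the example does not use. As a small sketch (the labels and values here are made up for illustration), a Series with custom labels can be read either by label or by position:

```python
import numpy as np
import pandas as pd

data = np.array([10, 20, 30, 40])

# Supply explicit labels and force a floating-point dtype.
s = pd.Series(data, index=["a", "b", "c", "d"], dtype="float64")

print(s["b"])            # label-based access: 20.0
print(s.iloc[0])         # position-based access: 10.0
print(s.index.tolist())  # ['a', 'b', 'c', 'd']
```

Without an index argument, pandas falls back to the default integer labels 0, 1, 2, ... seen in the output above.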

Pandas DataFrame

A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns. A pandas DataFrame can be created using the following constructor:

pandas.DataFrame( data, index, columns, dtype, copy)

Let us now create an indexed DataFrame from a dict of lists.

import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data, index=['rank1','rank2','rank3','rank4'])
print(df)
Its output is as follows (recent pandas versions keep the dict's column order):
        Name  Age
rank1    Tom   28
rank2   Jack   34
rank3  Steve   29
rank4  Ricky   42
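Once a DataFrame has an index and named columns, rows and columns can be selected by label. The following sketch reuses the same data; the columns argument and the variable `older` are additions for illustration only.

```python
import pandas as pd

data = {"Name": ["Tom", "Jack", "Steve", "Ricky"], "Age": [28, 34, 29, 42]}

# The columns argument fixes the column order explicitly.
df = pd.DataFrame(
    data,
    index=["rank1", "rank2", "rank3", "rank4"],
    columns=["Age", "Name"],
)

# Select a single cell by row label and column label.
print(df.loc["rank2", "Name"])  # Jack

# A column behaves like a Series, so aggregations work on it.
print(df["Age"].mean())  # 33.25

# Boolean filtering keeps only the rows that match a condition.
older = df[df["Age"] > 30]
print(older.index.tolist())  # ['rank2', 'rank4']
```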
Pandas Panel
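A Panel is a 3D container of data. The term panel data is derived from econometrics and is partially responsible for the name pandas: pan(el)-da(ta)-s. Note that Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so the example below runs only on older pandas releases.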

A Panel can be created using the following constructor:

pandas.Panel(data, items, major_axis, minor_axis, dtype, copy)

In the example below we create a Panel from a dict of DataFrame objects.

# creating a panel from a dict of DataFrames
# (requires an older pandas release, before 0.25, where pd.Panel still exists)
import pandas as pd
import numpy as np
data = {'Item1': pd.DataFrame(np.random.randn(4, 3)),
        'Item2': pd.DataFrame(np.random.randn(4, 2))}
p = pd.Panel(data)
print(p)

Its output is as follows:

Dimensions: 2 (items) x 4 (major_axis) x 3 (minor_axis)
Items axis: Item1 to Item2
Major_axis axis: 0 to 3
Minor_axis axis: 0 to 2
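Because Panel is no longer available in current pandas releases, the same dict of DataFrames is commonly stored as a single DataFrame with a two-level (MultiIndex) row index. The following is a sketch of that alternative, not the original author's method; the level names "item" and "row" and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

frames = {
    "Item1": pd.DataFrame(np.random.randn(4, 3)),
    "Item2": pd.DataFrame(np.random.randn(4, 2)),
}

# Stack the frames; the dict keys become the outer index level.
stacked = pd.concat(frames, names=["item", "row"])

print(stacked.shape)  # (8, 3): 2 items x 4 rows, 3 columns

# Selecting on the outer level returns one item's 4 x 3 block.
item1 = stacked.loc["Item1"]
print(item1.shape)  # (4, 3)

# Item2 only had two columns, so its third column is filled with NaN.
print(stacked.loc["Item2"][2].isna().all())  # True
```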

This brings us to the end of the blog. This Tecklearn ‘Data Operations in Python’ blog covers commonly asked questions for anyone looking for a job in Python programming. If you wish to learn Python and build a career in the Data Science domain, check out our interactive Python with Data Science Training, which comes with 24*7 support to guide you throughout your learning period. The course details are available at:

https://www.tecklearn.com/course/python-with-data-science/

Python with Data Science Training

About the Course

Python with Data Science training lets you master the concepts of the widely used and powerful programming language, Python. This Python Course will also help you master important Python programming concepts such as data operations, file operations, object-oriented programming and various Python libraries such as Pandas, NumPy, Matplotlib which are essential for Data Science. You will work on real-world projects in the domain of Python and apply it for various domains of Big Data, Data Science and Machine Learning.

Why Should you take Python with Data Science Training?

  • Python is the preferred language for new technologies such as Data Science and Machine Learning.
  • Average salary of Python Certified Developer is $123,656 per annum – Indeed.com
  • Python is by far the most popular language for data science. Python held 65.6% of the data science market.

What you will Learn in this Course?

Introduction to Python

  • Define Python
  • Understand the need for Programming
  • Know why to choose Python over other languages
  • Setup Python environment
  • Understand various Python concepts – Variables, Data Types, Operators, Conditional Statements and Loops
  • Illustrate String formatting
  • Understand Command Line Parameters and Flow control

Python Environment Setup and Essentials

  • Python installation
  • Windows, Mac & Linux distribution for Anaconda Python
  • Deploying Python IDE
  • Basic Python commands, data types, variables, keywords and more

Python language Basic Constructs

  • Looping in Python
  • Data Structures: List, Tuple, Dictionary, Set
  • First Python program
  • Write a Python Function (with and without parameters)
  • Create a member function and a variable
  • Tuple
  • Dictionary
  • Set and Frozen Set
  • Lambda function

OOP (Object Oriented Programming) in Python

  • Object-Oriented Concepts

Working with Modules, Handling Exceptions and File Handling

  • Standard Libraries
  • Modules Used in Python (OS, Sys, Date and Time etc.)
  • The Import statements
  • Module search path
  • Package installation ways
  • Errors and Exception Handling
  • Handling multiple exceptions

Introduction to NumPy

  • Introduction to arrays and matrices
  • Indexing of array, datatypes, broadcasting of array math
  • Standard deviation, Conditional probability
  • Correlation and covariance
  • NumPy Exercise Solution

Introduction to Pandas

  • Pandas for data analysis and machine learning
  • Pandas for data analysis and machine learning Continued
  • Time series analysis
  • Linear regression
  • Logistic Regression
  • ROC Curve
  • Neural Network Implementation
  • K Means Clustering Method

Data Visualisation

  • Matplotlib library
  • Grids, axes, plots
  • Markers, colours, fonts and styling
  • Types of plots – bar graphs, pie charts, histograms
  • Contour plots

Data Manipulation

  • Perform function manipulations on Data objects
  • Perform Concatenation, Merging and Joining on DataFrames
  • Iterate through DataFrames
  • Explore Datasets and extract insights from it

Scikit-Learn for Natural Language Processing

  • What is natural language processing, working with NLP on text data
  • Scikit-Learn for Natural Language Processing
  • The Scikit-Learn machine learning algorithms
  • Sentiment Analysis – Twitter

Introduction to Python for Hadoop

  • Deploying Python coding for MapReduce jobs on Hadoop framework.
  • Python for Apache Spark coding
  • Deploying Spark code with Python
  • Machine learning library of Spark MLlib
  • Deploying Spark MLlib for Classification, Clustering and Regression
