CSE

Uploaded on

29 Dec 2022

How To Load Files Into Python?

Skill-Lync

Python can read data from many sources, including files and databases. Two of the most commonly used file types are .txt and .csv. You can read and write such files either with Python's built-in file handling or with the csv module from the standard library.
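As a minimal sketch of the two approaches just mentioned, the snippet below writes a small sample file and reads it back, first with plain `open()` and then with the standard-library `csv` module. The file name `sample.csv` and its contents are made up for illustration.

```python
import csv
import os
import tempfile

# Create a small sample file so the example is self-contained
# (hypothetical name and contents).
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    f.write("name,score\nAsha,91\nRavi,78\n")

# 1) Built-in file handling: read the raw lines of text.
with open(path, "r") as f:
    lines = f.read().splitlines()

# 2) csv module: parse each row into a dict keyed by the header.
with open(path, newline="") as f:
    rows = list(csv.DictReader(f))

print(lines)
print(rows)
```

Note that the csv module returns every value as a string, so numeric columns still need converting.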

How to Import Data into Python? 

First, the data has to be imported into Python. Loading it as a DataFrame makes it much easier to handle, and the pandas library is built specifically for working with data in that form.

What are the Different File Types You Can Load in Python?

The 14 file types below can be loaded in Python. pandas reads several of them directly; the others need an additional library, as the examples show.

Comma-separated values (CSV)

pandas.read_csv(filepath_or_buffer, sep=NoDefault.no_default, delimiter=None, header='infer', names=NoDefault.no_default, index_col=None, usecols=None, squeeze=None, prefix=NoDefault.no_default, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=None, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, error_bad_lines=None, warn_bad_lines=None, on_bad_lines=None, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None, storage_options=None)

import pandas as pd

pd.read_csv('sometext.csv')
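Most calls only need a handful of these parameters. The sketch below, which assumes pandas is installed, reads a small semicolon-separated sample from an in-memory buffer (`io.StringIO` stands in for a file path) and uses `sep`, `usecols`, `nrows` and `na_values`. The sample data and column names are invented for illustration.

```python
import io
import pandas as pd

# In-memory stand-in for a file on disk (hypothetical data).
raw = io.StringIO(
    "id;city;temp\n"
    "1;Chennai;31.5\n"
    "2;Pune;missing\n"
    "3;Delhi;28.0\n"
)

df = pd.read_csv(
    raw,
    sep=";",                    # fields are separated by semicolons
    usecols=["city", "temp"],   # keep only these columns
    nrows=2,                    # read only the first two data rows
    na_values=["missing"],      # treat "missing" as a missing value
)

print(df)
```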

XLSX

pandas.read_excel(io, sheet_name=0, header=0, names=None, index_col=None, usecols=None, squeeze=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, parse_dates=False, date_parser=None, thousands=None, decimal='.', comment=None, skipfooter=0, convert_float=None, mangle_dupe_cols=True, storage_options=None)

import pandas as pd

pd.read_excel('someexcelfile.xlsx')

Zip file

import pandas as pd

# read the dataset using zip compression
# (the archive must contain exactly one data file)

df = pd.read_csv('test.zip', compression='zip')

# display dataset

print(df.head())
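When an archive holds more than one file, the standard-library `zipfile` module can list its members and open one of them. The following is a sketch under that assumption; the member names are hypothetical, and the archive is built in memory so the example runs on its own.

```python
import csv
import io
import zipfile

# Build a small archive in memory with two members (hypothetical names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("test.csv", "a,b\n1,2\n3,4\n")
    zf.writestr("notes.txt", "not tabular data")

with zipfile.ZipFile(buffer) as zf:
    members = zf.namelist()
    # Open one member as a binary stream and wrap it for text reading.
    with zf.open("test.csv") as member:
        text = io.TextIOWrapper(member, encoding="utf-8", newline="")
        rows = list(csv.reader(text))

print(members)
print(rows)
```

The handle returned by `zf.open(...)` is file-like, so it can also be passed to `pd.read_csv` instead of `csv.reader`.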

Plain Text (txt)

# importing pandas

import pandas as pd  

# read text file into pandas DataFrame

df = pd.read_csv("gfg.txt", sep=" ")

# display DataFrame

print(df)

JSON

import pandas as pd

file_df = pd.read_json('E:/datasets/filename.json')

file_df.head()
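`pd.read_json` works best when the JSON maps directly onto a table, such as a list of records. For nested files it can help to inspect the structure first with the standard-library `json` module. A sketch with invented sample data:

```python
import json

# Hypothetical JSON content: one top-level key holding a list of records.
raw = '{"students": [{"name": "Asha", "marks": 91}, {"name": "Ravi", "marks": 78}]}'

payload = json.loads(raw)       # for a file on disk, use json.load(file_object)
records = payload["students"]   # the list of dicts is the tabular part

# Column names are the keys of each record.
columns = sorted(records[0].keys())

print(columns)
print(records)
```

A list of dicts like `records` can be passed straight to `pd.DataFrame(records)`.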

XML

import xml.etree.ElementTree as ET

import pandas as pd

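# parse the XML file and get its root element

root = ET.parse('filename.xml').getroot()

data = []

cols = []

# each child of the root becomes one column, named after its tag

for child in root:

    data.append([subchild.text for subchild in child])

    cols.append(child.tag)

# data holds one list per child, so transpose to turn those lists into columns

df = pd.DataFrame(data).T

df.columns = cols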

print(df)
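The loop above treats each child of the root as a column. If each child is instead one record (a common layout), a row-oriented variant might look like the sketch below. The XML content and tag names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: each <student> element is one record.
xml_data = """
<students>
  <student><name>Asha</name><marks>91</marks></student>
  <student><name>Ravi</name><marks>78</marks></student>
</students>
""".strip()

root = ET.fromstring(xml_data)

# One dict per record: tag name -> text content.
records = [{field.tag: field.text for field in child} for child in root]

print(records)
```

Passing `records` to `pd.DataFrame(records)` then gives one row per record. Recent pandas versions also provide `pd.read_xml` for simple files.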

HTML

import pandas as pd

# read_html returns a list of DataFrames, one per table found on the page

table_MN = pd.read_html('https://en.wikipedia.org/wiki/something')

Images

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

#create an image

imar = np.array([[[1.,0.],[0.,0.]],

                 [[0.,1.],[0.,1.]],

                 [[0.,0.],[1.,1.]]]).transpose()

plt.imsave('pic.jpg', imar)

#create dataframe

df = pd.DataFrame([[0,""]], columns=["Feature1","Feature2"])

# read the image

im = plt.imread('pic.jpg')

plt.imshow(im)

plt.show()

Hierarchical Data Format (HDF5)

import pandas as pd

subjectsdata = {'Name': ['sravan', 'sravan', 'sravan', 'sravan',

                         'sravan', 'sravan', 'sravan', 'sravan',

                         'Ojaswi', 'Ojaswi', 'Ojaswi', 'Ojaswi',

                         'Ojaswi', 'Ojaswi', 'Ojaswi', 'Ojaswi',

                         'Rohith', 'Rohith', 'Rohith', 'Rohith',

                         'Rohith', 'Rohith', 'Rohith', 'Rohith'],

                  

                'college': ['VFSTRU', 'VFSTRU', 'VFSTRU', 'VFSTRU',

                            'VFSTRU', 'VFSTRU', 'VFSTRU', 'VFSTRU',

                            'VIT', 'VIT', 'VIT', 'VIT', 'VIT', 'VIT',

                            'VIT', 'VIT', 'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu',

                            'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu', 'IIT-Bhu',

                            'IIT-Bhu'],

                  

                'subject': ['java', 'dbms', 'dms', 'coa', 'python', 'dld',

                            'android', 'iot', 'java', 'dbms', 'dms', 'coa',

                            'python', 'dld', 'android', 'iot', 'java',

                            'dbms', 'dms', 'coa', 'python', 'dld', 'android',

                            'iot']

                }

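df = pd.DataFrame(subjectsdata)

# write the DataFrame to an HDF5 file, then load it back
# (requires the PyTables package; 'subjects.h5' is an example file name)

df.to_hdf('subjects.h5', key='subjects', mode='w')

df = pd.read_hdf('subjects.h5', key='subjects')

print(df)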

PDF

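# requires the tabula-py package, which in turn needs Java

from tabula import read_pdf

# read_pdf returns a list of DataFrames, one per table it finds

tables = read_pdf('data.pdf')

df = tables[0]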

DOCX

pip install pandas

pip install python-docx

import pandas as pd

from docx import Document

document = Document("<<docx file path>>")

# take the first table in the document

table = document.tables[0]

data = [[cell.text for cell in row.cells] for row in table.rows]

df = pd.DataFrame(data)

MP3

import librosa

song_path = 'track1.mp3'

y,sr = librosa.load(song_path,sr=22050)

print(y)

librosa.load returns the audio samples (y) as a NumPy array, together with the sampling rate (sr).

MP4

import pylab

import imageio

filename = '/tmp/file.mp4'

vid = imageio.get_reader(filename,  'ffmpeg')

nums = [10, 287]

for num in nums:

    image = vid.get_data(num)

    fig = pylab.figure()

    fig.suptitle('image #{}'.format(num), fontsize=20)

    pylab.imshow(image)

pylab.show()

SQL

pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)
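As a usage sketch, the example below builds a throwaway in-memory SQLite database with the standard-library `sqlite3` module and queries it with `pd.read_sql`. The table and column names are hypothetical, and pandas is assumed to be installed.

```python
import sqlite3
import pandas as pd

# Throwaway in-memory database with a hypothetical table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE students (name TEXT, marks INTEGER)")
con.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Asha", 91), ("Ravi", 78), ("Meena", 85)],
)
con.commit()

# `params` fills the ? placeholder, which avoids building SQL by hand.
df = pd.read_sql(
    "SELECT name, marks FROM students WHERE marks > ? ORDER BY marks DESC",
    con,
    params=(80,),
)
con.close()

print(df)
```

For other databases, `con` would typically be a SQLAlchemy connection or engine rather than a `sqlite3` connection.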


Author

Navin Baskar

