Sagar Biswas
updated on 22 Sep 2022
AIM: TO PERFORM CURVE FITTING OPERATIONS USING PYTHON
OBJECTIVES:
1. EXPLAIN THE ROLE OF 'popt' & 'pcov' IN THE curve_fit FUNCTION.
2. EXPLAIN THE ROLE OF 'np.array(temperature)' AS USED IN THE CODE.
3. EXPLAIN THE ROLE OF '*' IN *popt.
4. TO PERFORM CURVE-FITTING OPERATIONS BY WRITING PYTHON CODE TO FIT A LINEAR AND A CUBIC POLYNOMIAL TO THE SPECIFIC HEAT (Cp) DATA, AND TO PLOT THE FITTED LINEAR AND CUBIC CURVES ALONG WITH THE RAW DATA POINTS.
5. STATE THE STEPS THAT CAN BE TAKEN TO IMPROVE THE QUALITY OF THE CURVE-FITTING OPERATION.
6. USING PYTHON, EMPIRICALLY SHOW THE GOODNESS OF FIT OF THE OBTAINED CURVES.
THEORY:
I) CURVE FITTING: Curve fitting examines the relationship between one or more predictors (independent variables) and a response variable (dependent variable), intending to define a "best fit" model of the relationship.
It is a type of optimization technique that finds an optimal set of parameters for a defined function that best fits a given set of observations. Unlike supervised learning, curve fitting requires us to define the function that maps examples of inputs to outputs.
Curve fitting involves first defining the functional form of the mapping function (also called the basis function or objective function), then searching for the parameters to the function that result in the minimum error.
Error is calculated by taking the observations from the domain, passing the inputs to our candidate mapping function to calculate outputs, and then comparing each calculated output to the observed output.
Once fit, we can use the mapping function to interpolate or extrapolate new points in the domain. It is common to run a sequence of input values through the mapping function to calculate a sequence of outputs, then create a line plot of the result to show how output varies with input and how well the line fits the observed points.
II) LINEAR CURVE FITTING: Linear curve fitting, or linear regression, is when the data are fit to a straight line. Although there might be some curve to your data, a straight line can provide a reasonable enough fit to make predictions. Since the equation of a generic straight line is always given by f(x) = a x + b, the question becomes: what values of 'a' and 'b' will give us the best-fit line for our data?
Treating the vertical distance from each point to a prospective line as an error, and summing these errors over our range, gives us a concrete number that expresses how far from 'best' the prospective line is. The line that minimizes this total error can be considered the best straight line.
Since it’s the distance from our points to the line we’re interested in—whether it is positive or negative distance is not relevant—we can square the distance in our error calculations. This will also allow us to weigh greater errors more heavily. This method is called the least squares approach.
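To make the least-squares idea concrete, here is a minimal sketch that computes the best-fit slope and intercept for a straight line in closed form and then evaluates the summed squared error; the data values are illustrative, not from this report:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # illustrative x-values (not the report's data)
y = np.array([2.1, 3.9, 6.2, 7.8])   # illustrative y-values

# Closed-form least-squares solution for f(x) = a*x + b
a = (np.mean(x * y) - np.mean(x) * np.mean(y)) / (np.mean(x ** 2) - np.mean(x) ** 2)
b = np.mean(y) - a * np.mean(x)

sse = np.sum((y - (a * x + b)) ** 2)  # sum of squared vertical distances to the line
print('a =', a, 'b =', b, 'SSE =', sse)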
III) POLYNOMIAL CURVE FITTING: Polynomial curve fitting is when we fit our data to the graph of a polynomial function. The least squares method can be used to find the polynomial, of a given degree, that has a minimum total error.
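As a quick illustration, NumPy's polyfit performs exactly this least-squares polynomial fit for a chosen degree; this is a sketch with made-up data, not the report's data set:

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])     # illustrative data
y = np.array([1.2, 2.1, 9.8, 30.5, 68.0])

coeffs = np.polyfit(x, y, deg=3)   # least-squares cubic; highest-power coefficient first
fit = np.polyval(coeffs, x)        # evaluate the fitted polynomial at the x-values
print(coeffs)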
MAIN REPORT:
1. POPT & PCOV: The return value 'popt' contains the best-fit values of the parameters, which in our case are the respective coefficients. The return value 'pcov' contains the covariance (error) matrix for the fit parameters. From it we can determine the standard deviations of the parameters (the standard deviations are the square roots of the diagonal values), and we can also determine the correlations between the fit parameters.
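A minimal sketch of how the parameter standard deviations can be pulled out of pcov; the fitting function and data here are illustrative, not the report's:

import numpy as np
from scipy.optimize import curve_fit

def line(t, a, b):
    return a * t + b

t = np.array([100.0, 200.0, 300.0, 400.0])   # illustrative data
y = np.array([1.01, 1.05, 1.12, 1.18])

popt, pcov = curve_fit(line, t, y)
perr = np.sqrt(np.diag(pcov))   # standard deviations = square roots of the diagonal
print('best-fit parameters:', popt)
print('parameter std. devs :', perr)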
Covariance is calculated between two variables; its purpose is to give a value that indicates how the two variables vary together. For each observation, the deviations of the two variables from their respective means are multiplied together.
Since covariance can only be calculated between two variables at a time, a covariance matrix is used to represent the covariance of every pair of variables in multivariate data.
For our case, if 'temperature' is taken as 'x' and 'specific heat' as 'y' in our data set, its covariance matrix takes the two-dimensional form shown below.
[Figures: two-dimensional and three-dimensional covariance matrices]
It is a symmetric matrix that shows the covariance of each pair of variables. The values in the covariance matrix describe the magnitude and direction of the spread of multivariate data in multidimensional space, telling us how the data are distributed across the two dimensions.
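For reference, the two-dimensional covariance matrix described above has the standard form

\Sigma = \begin{pmatrix} \mathrm{Var}(x) & \mathrm{Cov}(x, y) \\ \mathrm{Cov}(y, x) & \mathrm{Var}(y) \end{pmatrix}

where Cov(x, y) = Cov(y, x), which is what makes the matrix symmetric.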
2. Role of "np.array(temperature)" in our code: The NumPy module is imported and used to convert the temperature values, stored in a list named 'temperature' in our code, into a NumPy array so that we can perform element-wise operations on it, such as raising it to a power with the 'pow' command, which cannot be performed on the list data type.
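A tiny illustration of the difference; the list values are made up:

import numpy as np

temperature = [100.0, 200.0, 300.0]   # plain Python list
arr = np.array(temperature)           # NumPy array

print(pow(arr, 3))     # works: element-wise cube of every temperature
# pow(temperature, 3)  # would raise a TypeError, since pow() is not defined for lists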
3. Role of '*' in *popt: The star in *popt unpacks the popt array so the respective optimized parameter values become arguments to the function.
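For example (the coefficient values here are illustrative, not the fitted coefficients from the report):

def function_0(t, a, b):
    return a * t + b

popt = [2.0, 1.0]               # suppose curve_fit returned these two coefficients
print(function_0(3.0, *popt))   # equivalent to function_0(3.0, 2.0, 1.0) -> 7.0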
4. PYTHON CODE TO FIT A LINEAR & CUBIC POLYNOMIAL FOR THE GIVEN DATA:
I) CODE FOR LINEAR CURVE FITTING:
# CURVE FITTING OPERATION FOR LINEAR POLYNOMIAL
# BY SAGAR BISWAS
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
# Function for Linear Polynomial
def function_0(t, a, b):  # As the degree is 1, the number of coefficients = 2
    return a * t + b      # Mapping function for a linear polynomial
# Reading the values of Temperature and Specific Heat from the file
def read_file():  # Reads the data file line by line and returns the two columns
    temperature = []  # Empty list to store the temperature values
    cp = []           # Empty list to store the specific-heat values
    for line in open('data', 'r'):  # Read each line of the file named 'data'
        values = line.split(',')    # Split each line at the ',' separator
        temperature.append(float(values[0]))  # Extract the temperature value and store it in 'temperature'
        cp.append(float(values[1]))           # Extract the specific-heat value and store it in 'cp'
    return [temperature, cp]  # Return the temperature and specific-heat lists to the caller
# Main Program
temperature, cp = read_file() # Reading the file for values of temperature & specific heat
popt, pcov = curve_fit(function_0, temperature, cp)
# The function curve_fit() returns the optimal values for the mapping function which is coefficient values here.
# It also returns a covariance matrix for the estimated parameters
fit_cp_l = function_0(np.array(temperature),*popt) # Updating Specific-Heat Values using Temperature and Coefficient values
# The star in *popt unpacks the popt array so the respective optimized parameter values become arguments to the function
plt.plot(temperature, cp, 'k--')
plt.plot(temperature, fit_cp_l, color='red', linewidth=1)
plt.legend(['Actual data', 'Curve fit'])
plt.xlabel('Temperature --->')
plt.ylabel('Specific Heat(Cp)--->')
plt.title('Linear Curve-Fitting')
plt.show()
'''
Measuring the fitness of the curves:

SSE  - Sum of Squared Errors.
       Error is the difference between the observed value and the predicted value.

SSR  - Sum of Squares of the Regression.
       The sum of the squared differences between the predicted values and the mean of the
       dependent variable. If SSR equals the sum of squares total, our regression model has
       captured all the observed variability and is perfect.

SST  = SSE + SSR - Sum of Squares in Total.
       The sum of the squared differences between the observed dependent variable and its mean.

R^2  = SSR/SST - A relative measure that takes values ranging from 0 to 1.
       An R-squared of 0 means our regression line explains none of the variability of the data;
       an R-squared of 1 means our model explains all of it. What counts as a good value depends
       on the complexity of the topic and how many variables are believed to be in play.

RMSE = (SSE/l)^0.5 - The square root of the variance of the residuals. It indicates the absolute
       fit of the model to the data: how close the observed data points are to the model's
       predicted values.

Finding the mean of the Cp data:
Mean = (sum of all elements)/(total number of elements)
'''
s = np.sum(cp) # Sum of all the values in Cp
l = np.size(cp) # Length of all the values in Cp
m = s / l # Mean value for Cp
print('Mean value for Specific Heat(Cp) :', m)
# For Linear-Curve Fit
linear_sse = 0 # Initial Values for SSE
linear_ssr = 0 # Initial Values for SSR
for i in range(l):
    linear_error = abs(cp[i] - fit_cp_l[i])            # Absolute error between observed and predicted Cp
    linear_sse = linear_sse + pow(linear_error, 2)     # Accumulate the sum of squared errors (SSE)
    linear_ssr = linear_ssr + pow(fit_cp_l[i] - m, 2)  # Accumulate the sum of squares of the regression (SSR)
linear_sst = linear_sse + linear_ssr # Final values for SST
print('Linear_SST :', linear_sst)
linear_R2 = linear_ssr / linear_sst # Final values for R-Square
print('Linear_R2 :', linear_R2)
linear_rmse = pow((linear_sse / l), 0.5) # Final values for Root Mean Square Error(RMSE)
print('Linear_RMSE :', linear_rmse)
The above Python code for the curve-fitting operation with a linear polynomial is explained step by step in the comments inside the code itself.
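For reference, the fitness measures used in the script can be written compactly as

SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad SST = SSE + SSR

R^2 = \frac{SSR}{SST}, \qquad RMSE = \sqrt{SSE/n}

where y_i are the observed Cp values, \hat{y}_i the fitted values, \bar{y} their mean, and n the number of data points.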
II) CODE FOR CUBIC CURVE FITTING:
# CURVE FITTING OPERATION FOR CUBIC POLYNOMIAL
# BY SAGAR BISWAS
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
# Function for Cubic Polynomial
def func_2(t, a, b, c, d):  # As the degree is 3, the number of coefficients = 4
    return a * pow(t, 3) + b * pow(t, 2) + c * t + d  # Mapping function for a cubic polynomial
# Reading the values of Temperature and Specific Heat from the file
def read_file():  # Reads the data file line by line and returns the two columns
    temperature = []  # Empty list to store the temperature values
    cp = []           # Empty list to store the specific-heat values
    for line in open('data', 'r'):  # Read each line of the file named 'data'
        values = line.split(',')    # Split each line at the ',' separator
        temperature.append(float(values[0]))  # Extract the temperature value and store it in 'temperature'
        cp.append(float(values[1]))           # Extract the specific-heat value and store it in 'cp'
    return [temperature, cp]  # Return the temperature and specific-heat lists to the caller
# Main Program
temperature, cp = read_file() # Reading the file for values of temperature & specific heat
popt, pcov = curve_fit(func_2, temperature, cp)
# The function curve_fit() returns the optimal values for the mapping function which is coefficient values here.
# It also returns a covariance matrix for the estimated parameters
fit_cp_2 = func_2(np.array(temperature), *popt) # Updating Specific-Heat Values using Temperature and Coefficient values
# The star in *popt unpacks the popt array so the respective optimized parameter values become arguments to the function
plt.figure(3)
plt.plot(temperature, cp, 'k--')
plt.plot(temperature, fit_cp_2, color='red', linewidth=1)
plt.legend(['Actual data', 'Curve fit'])
plt.xlabel('Temperature --->')
plt.ylabel('Specific Heat(Cp)--->')
plt.title('Cubic Curve-Fitting')
plt.show()
'''
Measuring the fitness of the curves:
SSE, SSR, SST, R^2 and RMSE are computed exactly as in the linear curve-fitting script above.
'''
s = np.sum(cp) # Sum of all the values in Cp
l = np.size(cp) # Length of all the values in Cp
m = s / l # Mean value for Cp
print('Mean value for Specific Heat(Cp) :', m)
# For Cubic-Curve Fit
cubic_sse = 0 # Initial Values for SSE
cubic_ssr = 0 # Initial Values for SSR
for i in range(l):
    cubic_error = abs(cp[i] - fit_cp_2[i])            # Absolute error between observed and predicted Cp
    cubic_sse = cubic_sse + pow(cubic_error, 2)       # Accumulate the sum of squared errors (SSE)
    cubic_ssr = cubic_ssr + pow(fit_cp_2[i] - m, 2)   # Accumulate the sum of squares of the regression (SSR)
cubic_sst = cubic_sse + cubic_ssr # Final values for SST
print('Cubic_SST :', cubic_sst)
cubic_R2 = cubic_ssr / cubic_sst # Final values for R-Square
print('Cubic_R2 :', cubic_R2)
cubic_rmse = pow((cubic_sse / l), 0.5) # Final values for Root Mean Square Error(RMSE)
print('Cubic_RMSE :', cubic_rmse)
The above Python code for the curve-fitting operation with a cubic polynomial is explained step by step in the comments inside the code itself.
RESULTS:
A) OUTPUT FOR LINEAR CURVE-FITTING OPERATION:
B) PLOTTING RESULTS FOR LINEAR CURVE-FITTING:
C) OUTPUT FOR CUBIC CURVE-FITTING OPERATION:
D) PLOTTING RESULTS FOR CUBIC CURVE-FITTING:
From the above results, it is evident that the cubic curve-fitting operation has delivered a better curve-fit in comparison to the linear curve-fitting operation.
5. STEPS THAT CAN BE TAKEN TO IMPROVE THE QUALITY OF THE CURVE FIT:
If it is evident from the plot that the curve-fitting operation is not fitting the data as needed, one can increase the degree of the polynomial being used for the operation. However, a higher-order polynomial will not always produce a proper fit, since beyond a certain degree it can lead to oscillatory results. At that point, we can use low-order polynomials with supplementary techniques such as piecewise (spline-based) fitting to obtain a better fit, as sketched below.
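A hedged sketch of such a piecewise fit using scipy.interpolate.UnivariateSpline; the synthetic data and the smoothing factor s are illustrative assumptions, not values from this report:

import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
t = np.linspace(300.0, 3000.0, 20)                        # illustrative temperatures
cp = 1000.0 + 0.2 * t - 2e-5 * t ** 2 + rng.normal(0, 5, t.size)

spl = UnivariateSpline(t, cp, k=3, s=t.size * 25.0)       # piecewise cubic fit with smoothing
print(spl(np.array([500.0, 1500.0, 2500.0])))             # evaluate the fitted spline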
6. SHOWING EMPIRICALLY WHY OUR RESULTS FOR CUBIC CURVE FITTING ARE BETTER:
From the above results, it is evident that a cubic polynomial gives better curve-fitting results for the given data. Since the R-squared value for the cubic polynomial is greater than the R-squared value for the linear one, we can conclude that the cubic model explains the variability of the data to a greater extent.
We know that RMSE is the square root of the variance of the residuals. It indicates the absolute fit of the model to the data: how close the observed data points are to the model's predicted values. Whereas R-squared is a relative measure of fit, RMSE is an absolute measure of fit.
Lower values of RMSE indicate a better fit, and from our output results it is evident that the RMSE for the cubic curve fit (5.4227) is significantly lower than the RMSE for the linear curve fit (25.9990).
RMSE is a good measure of how accurately the model predicts the response. It’s the most important criterion for fit if the main purpose of the model is prediction.