- Matplotlib 3.0 Cookbook
- Srinivasa Rao Poladi
- 2025-04-04 16:06:37
There's more...
On the y axis, instead of plotting frequency, you can plot a normalized probability density by specifying density=True (or, equivalently, density=1) in plt.hist(). The bar heights are then scaled so that the total area of the bars is 1, and each bar's height multiplied by its bin width gives the fraction of the entries in the grp_exp list that fall in that bin. You can also plot an approximate normal distribution using the mean and standard deviation of this data to see how well the data follows a normal distribution:
- Create a NumPy array with the work experience (in years) of participants of a lateral training class:
import numpy as np
import matplotlib.pyplot as plt

grp_exp = np.array([12, 15, 13, 20, 19, 20, 11, 19, 11, 12, 19, 13,
                    12, 10, 6, 19, 3, 1, 1, 0, 4, 4, 6, 5, 3, 7, 12,
                    7, 9, 8, 12, 11, 11, 18, 19, 18, 19, 3, 6, 5, 6,
                    9, 11, 10, 14, 14, 16, 17, 17, 19, 0, 2, 0, 3,
                    1, 4, 6, 6, 8, 7, 7, 6, 7, 11, 11, 10, 11, 10,
                    13, 13, 15, 18, 20, 19, 1, 10, 8, 16, 19, 19,
                    17, 16, 11, 1, 10, 13, 15, 3, 8, 6, 9, 10, 15,
                    19, 2, 4, 5, 6, 9, 11, 10, 9, 10, 9, 15, 16, 18,
                    13])
- Plot the distribution of experience:
nbins = 21
n, bins, patches = plt.hist(grp_exp, bins=nbins, density=True)
- Add axis labels and a title:
plt.xlabel("Experience in years")
plt.ylabel("Probability density")
plt.title("Distribution of Experience in a Lateral Training Program")
- Compute the mean (mu) and the standard deviation (sigma) of the grp_exp data (NumPy's std() uses the population formula, ddof=0, by default):
mu = grp_exp.mean()
sigma = grp_exp.std()
- Add a best-fit line for a normal distribution with the computed mu and sigma:
y = (1 / (np.sqrt(2 * np.pi) * sigma)) * np.exp(-0.5 * ((bins - mu) / sigma) ** 2)
plt.plot(bins, y, '--')
- Display the plot on the screen:
plt.show()
Here is how the output plot looks:

Clearly, the data does not follow a normal distribution, as there are too many bins way above or way below the best-fit normal curve.
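To make the density scaling and the visual comparison above more concrete, here is a minimal sketch that repeats the binning without plotting. It assumes np.histogram is an acceptable stand-in for plt.hist here (plt.hist delegates its binning to it), and it evaluates the normal curve at bin centers rather than at the bin edges used in the recipe. The names density, edges, widths, fraction, centers, and max_gap are illustrative and are not part of the original recipe.

```python
import numpy as np

# Same data as in the recipe above.
grp_exp = np.array([12, 15, 13, 20, 19, 20, 11, 19, 11, 12, 19, 13,
                    12, 10, 6, 19, 3, 1, 1, 0, 4, 4, 6, 5, 3, 7, 12,
                    7, 9, 8, 12, 11, 11, 18, 19, 18, 19, 3, 6, 5, 6,
                    9, 11, 10, 14, 14, 16, 17, 17, 19, 0, 2, 0, 3,
                    1, 4, 6, 6, 8, 7, 7, 6, 7, 11, 11, 10, 11, 10,
                    13, 13, 15, 18, 20, 19, 1, 10, 8, 16, 19, 19,
                    17, 16, 11, 1, 10, 13, 15, 3, 8, 6, 9, 10, 15,
                    19, 2, 4, 5, 6, 9, 11, 10, 9, 10, 9, 15, 16, 18,
                    13])

# Bin the data into 21 bins with density normalization, without drawing anything.
density, edges = np.histogram(grp_exp, bins=21, density=True)
widths = np.diff(edges)

# With density=True, the bar areas (height * width) add up to 1.
total_area = np.sum(density * widths)

# Height * width recovers the fraction of entries in each bin.
fraction = density * widths

# Normal density with the sample mean and (population) standard deviation,
# evaluated at the bin centers so it lines up with the bar heights.
mu = grp_exp.mean()
sigma = grp_exp.std()
centers = 0.5 * (edges[:-1] + edges[1:])
normal_pdf = np.exp(-0.5 * ((centers - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Largest vertical gap between a bar and the normal curve.
max_gap = np.max(np.abs(density - normal_pdf))

print("total area under the bars:", total_area)
print("largest gap between bars and normal curve:", max_gap)
```

The gap is only a rough, informal indicator of how far the bars stray from the curve. It is not a statistical test of normality.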