Title: Matplotlib 3.0 Cookbook
Author: Srinivasa Rao Poladi
Last updated: 2025-04-04 16:06:37

There's more...

On the y axis, instead of plotting frequency, you can plot a probability density by specifying density=True in plt.hist(). The bar heights are then scaled so that the total area of all the bars equals 1, which means the share of entries in each bin is the bar height multiplied by the bin width. You can also overlay a normal distribution curve, computed from the mean and standard deviation of the data, to see how closely the data follows a normal distribution:

Create a NumPy array with the work experience (in years) of participants in a lateral training class:

import matplotlib.pyplot as plt
import numpy as np

grp_exp = np.array([12, 15, 13, 20, 19, 20, 11, 19, 11, 12, 19, 13,
                    12, 10, 6, 19, 3, 1, 1, 0, 4, 4, 6, 5, 3, 7, 12, 
                    7, 9, 8, 12, 11, 11, 18, 19, 18, 19, 3, 6, 5, 6, 
                    9, 11, 10, 14, 14, 16, 17, 17, 19, 0, 2, 0, 3, 
                    1, 4, 6, 6, 8, 7, 7, 6, 7, 11, 11, 10, 11, 10, 
                    13, 13, 15, 18, 20, 19, 1, 10, 8, 16, 19, 19, 
                    17, 16, 11, 1, 10, 13, 15, 3, 8, 6, 9, 10, 15, 
                    19, 2, 4, 5, 6, 9, 11, 10, 9, 10, 9, 15, 16, 18, 
                    13])

Plot the distribution of experience:

nbins = 21
n, bins, patches = plt.hist(grp_exp, bins=nbins, density=True)
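To make the effect of the density option concrete, here is a minimal sketch that reproduces the normalization by hand. It uses np.histogram, which applies the same scaling as plt.hist, and a small made-up array (sample is a hypothetical stand-in, not the grp_exp data):

```python
import numpy as np

# Hypothetical sample, used only to illustrate the normalization.
sample = np.array([0, 1, 1, 2, 4, 4, 4, 6, 8, 10])

# Raw counts per bin and the bin edges (5 equal-width bins from 0 to 10).
counts, edges = np.histogram(sample, bins=5)

# The same histogram, normalized to a probability density.
density, _ = np.histogram(sample, bins=5, density=True)

# Width of each bin, taken from consecutive edges.
widths = np.diff(edges)

# Density is the count divided by (total count * bin width).
manual_density = counts / (counts.sum() * widths)

# Multiplying density by bin width recovers the share of entries per bin.
fractions = density * widths

# The total area under the bars is 1.
total_area = np.sum(density * widths)
```

Because the bars are scaled by bin width, the heights only equal per-bin fractions when every bin has a width of exactly 1.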

Add axis labels and a title:

plt.xlabel("Experience in years")
plt.ylabel("Probability density")
plt.title("Distribution of Experience in a Lateral Training Program")

Compute the mean (mu) and the standard deviation (sigma) of the grp_exp data:

mu = grp_exp.mean()
sigma = grp_exp.std()
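Note that the std() method of a NumPy array computes the population standard deviation by default (ddof=0). If you would rather treat the data as a sample, you can pass ddof=1. The following sketch shows the difference on a small hypothetical array (values is made up for illustration):

```python
import numpy as np

# Hypothetical data, chosen so the population standard deviation is exactly 2.
values = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = values.mean()

# Default: divide the sum of squared deviations by N (population).
population_std = values.std()

# ddof=1: divide by N - 1 instead (sample standard deviation).
sample_std = values.std(ddof=1)
```

For an array as large as grp_exp, the two values are close, so the choice has little visible effect on the fitted curve.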

Overlay the normal probability density curve for the computed mu and sigma, evaluated at the bin edges:

y = (1 / (np.sqrt(2 * np.pi) * sigma)) * np.exp(-0.5 * ((bins - mu) / sigma) ** 2)
plt.plot(bins, y, '--')
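The expression above is the normal probability density function. As a sketch, it can be wrapped in a small helper and evaluated on a finer grid than the bin edges, which gives a smoother curve. The helper name normal_pdf and the values of mu and sigma below are assumptions for illustration, not part of the recipe:

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical parameters; in the recipe these come from grp_exp.
mu, sigma = 10.0, 6.0

# A fine, evenly spaced grid covering six standard deviations on each side.
x = np.linspace(mu - 6 * sigma, mu + 6 * sigma, 2001)
y = normal_pdf(x, mu, sigma)

# Trapezoidal estimate of the area under the curve; it should be close to 1.
area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

# The curve peaks at x = mu with height 1 / (sigma * sqrt(2 * pi)).
peak = y.max()
```

Passing such an x and y to plt.plot(x, y, '--') draws the same curve as the recipe, only with more points.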

Display the plot on the screen:

plt.show()

Here is how the output plot looks:

Clearly, the data does not follow a normal distribution, as there are too many bins well above or well below the fitted normal curve.
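If you want to put numbers on this visual comparison, one possible approach is to compare the observed share of entries in each bin with the share a normal distribution would place there. The sketch below does this on a small hypothetical array. The helper normal_cdf and the sample data are assumptions for illustration, and this is only a rough check, not a formal normality test:

```python
import math
import numpy as np

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical sample with clusters at both ends, used only for illustration.
sample = np.array([0, 0, 1, 1, 2, 9, 10, 10, 11, 18, 19, 19, 20, 20],
                  dtype=float)

# Observed share of entries in each of 4 equal-width bins.
counts, edges = np.histogram(sample, bins=4)
observed = counts / counts.sum()

# Share a normal distribution with the same mean and standard deviation
# would place in each bin: the CDF difference between the bin edges.
mu, sigma = sample.mean(), sample.std()
cdf_at_edges = np.array([normal_cdf(e, mu, sigma) for e in edges])
expected = np.diff(cdf_at_edges)

# Positive values mean the bin holds more data than the normal curve predicts.
deviation = observed - expected
```

Large positive and negative entries in deviation correspond to bars that sit well above or well below the curve in the plot.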