
Modelling sound absorption properties of broom fibers using artificial neural networks

The use of broom to produce fibers has ancient roots. The Greeks appreciated its resistance to water and for this reason used it to manufacture sailing ropes. Broom fiber was also appreciated for its sound absorption qualities. In this study, a new methodology was developed for the numerical modeling of the acoustic behavior of broom fibers. First, the characteristics of the different varieties of broom were examined and the procedures for processing the samples to be analyzed were described. Subsequently, the results of the measurements of the following acoustic properties of the material were analyzed: air flow resistance, porosity, and sound absorption coefficient. Finally, the results of the numerical modeling of the sound absorption coefficient, obtained using an algorithm based on artificial neural networks, were reported and compared with a model based on linear regression.

https://doi.org/10.1016/j.apacoust.2020.107239
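For readers curious how such a comparison might look in code, here is a minimal sketch, not the study's actual pipeline: it fits a small scikit-learn neural network and a plain linear regression to synthetic absorption data and compares their errors. The frequency range, the saturating curve, and the noise level are all invented for illustration:
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
freq_khz = rng.uniform(0.1, 5.0, size=(300, 1))  # hypothetical frequencies in kHz
# Invented absorption coefficient: a saturating, nonlinear function of frequency
alpha = 1 - np.exp(-freq_khz[:, 0] / 2.0) + rng.normal(0, 0.02, 300)

ann = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000, random_state=0).fit(freq_khz, alpha)
lin = LinearRegression().fit(freq_khz, alpha)

# The nonlinear model should track the saturating curve more closely
print('ANN MAE:   ', mean_absolute_error(alpha, ann.predict(freq_khz)))
print('Linear MAE:', mean_absolute_error(alpha, lin.predict(freq_khz)))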

Case study: Automated recognition of wind farm sound using artificial neural networks

Wind energy has been one of the most widely used forms of energy since ancient times. It is a widespread type of clean energy that is available in mechanical form and can be efficiently transformed into electricity. However, wind turbines can raise concerns around noise pollution and visual impact. Modern turbines can generate more electrical power than older turbines even though they produce a comparable sound power level. Despite this, protests from citizens living in the vicinity of wind farms continue to be a problem for the institutions that issue permits. In this article, acoustic measurements carried out inside a house were used to create a model based on artificial neural networks for the automatic recognition of the noise emitted by a wind farm under different operating conditions. The high accuracy of the models obtained suggests the adoption of this tool for several applications.

https://doi.org/10.3397/1/376814

Wind turbine noise prediction using random forest regression

Wind energy is one of the most widely used renewable energy sources in the world and has grown rapidly in recent years. However, wind towers generate noise that is perceived as an annoyance by the population living near wind farms. It is therefore important to develop new tools that can help wind farm builders and administrations. In this study, measurements of the noise emitted by a wind farm and the data recorded by the supervisory control and data acquisition (SCADA) system were used to construct a prediction model. First, the acoustic measurements and control system data were analyzed to characterize the phenomenon. An appropriate number of observations were then extracted, and these data were pre-processed. Subsequently, two models for predicting sound pressure levels at the receiver were built: one based on multiple linear regression, and one based on the Random Forest algorithm. The wind speeds measured near the wind turbines and the active power of the turbines were selected as predictors; both were measured by the SCADA system of the wind turbines. The model based on the Random Forest algorithm showed a high Pearson correlation coefficient (0.981), indicating a high number of correct predictions. This model can be extremely useful, both for the receiver and for the wind farm manager. Through the results of the model, it will be possible to establish for which wind speed values the noise produced by the wind turbines becomes dominant. Furthermore, the predictive model can give an overview of the noise received from the system under different operating conditions.

https://doi.org/10.3390/machines7040069
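To make the approach concrete, here is a minimal sketch, not the study's code: it trains a scikit-learn RandomForestRegressor on synthetic stand-ins for the two SCADA predictors named in the abstract and scores it with the Pearson correlation coefficient, as the study does. Every value below is hypothetical:
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical SCADA records: the two predictors named in the abstract
scada = pd.DataFrame({
    'wind_speed': rng.uniform(3, 25, 500),      # m/s, measured near the turbines
    'active_power': rng.uniform(0, 2000, 500),  # kW
})
# Synthetic stand-in for the sound pressure level measured at the receiver
scada['spl'] = 35 + 0.8 * scada['wind_speed'] + 0.005 * scada['active_power'] + rng.normal(0, 1, 500)

X_train, X_test, y_train, y_test = train_test_split(
    scada[['wind_speed', 'active_power']], scada['spl'], random_state=0)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# Evaluate with the Pearson correlation between predicted and measured levels
print(np.corrcoef(rf.predict(X_test), y_test)[0, 1])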

Fault diagnosis for UAV blades using artificial neural network

In recent years, unmanned aerial vehicles (UAVs) have been used in several fields including, for example, archaeology, cargo transport, conservation, healthcare, filmmaking, hobbies, and recreational use. UAVs are aircraft characterized by the absence of a human pilot on board. The extensive use of these devices has highlighted maintenance problems with regard to the propellers, which are the source of propulsion of the aircraft. A defect in the propellers of a drone can cause the aircraft to fall to the ground and be destroyed, and it also constitutes a safety problem for objects and people within the range of action of the aircraft. In this study, measurements of the noise emitted by a UAV were used to build a classification model to detect unbalanced blades in a UAV propeller. To simulate the fault condition, two strips of paper tape were applied to the upper surface of a blade. The paper tape created a substantial modification of the aerodynamics of the blade, and this modification characterized the noise produced by the blade in its rotation. A model based on artificial neural network algorithms was then built to detect unbalanced blades in a UAV propeller. This model showed high accuracy (0.9763), indicating a high number of correct detections, and suggests the adoption of this tool to verify the operating conditions of a UAV. The test must be performed indoors: from the measurements of the noise produced by the UAV, it is possible to identify an imbalance in the propeller blades.

https://doi.org/10.3390/robotics8030059
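As a rough illustration of this kind of acoustic fault detection, here is a minimal sketch rather than the paper's actual pipeline: a small scikit-learn neural network trained on hypothetical spectral levels for balanced and unbalanced blades. All the data below are synthetic:
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical spectral levels (dB) in 30 bands; unbalanced blades shifted upward
X_balanced = rng.normal(60, 2, size=(100, 30))
X_unbalanced = rng.normal(63, 2, size=(100, 30))
X = np.vstack([X_balanced, X_unbalanced])
y = np.array([0] * 100 + [1] * 100)  # 0 = balanced, 1 = unbalanced

ann = MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000, random_state=0)
print(cross_val_score(ann, X, y, cv=5).mean())  # mean classification accuracy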

Representation of the soundscape quality in urban areas through colours

Noise mapping is a useful and widespread method to visualise various items such as exposure to noise pollution, statistics of the affected population, and the contributions of different noise sources, and it is also a useful tool in designing noise-control plans. Some studies have moved a step further, proposing maps to represent people's perception of the acoustic environment. Most of these maps use colours as mere tools to display the spatial variability of acoustic parameters. In this paper, the colours associated by interviewed people with different urban soundscapes have been analysed, and the possibility of using meaningful colours to represent the soundscape quality in noise mapping has been examined.

https://doi.org/10.1515/noise-2019-0002

Noise Exposure of PC Video Games Players

Video games are a leisure activity practiced by more and more people, and the average age of users is gradually increasing, making them a pleasant activity for any age. The literature has widely raised the question of whether such widespread use could have negative consequences for the health of users. This article describes noise exposure measurement activities for video game users. The damage caused by noise depends on both the acoustic power and the exposure time. For this reason, different noise exposure scenarios produced by video games have been simulated. The results of the study show that the daily level of noise exposure is close to the limits imposed by legislation, despite the hours of rest, even though the measurements were performed in an environment with low background noise (46.0 dBA).

Read the paper

Heating, Ventilation, and Air Conditioning (HVAC) Noise Detection in Open-Plan Offices Using Recursive Partitioning

Open-plan offices have lower construction costs, allow significant savings in space and, according to designers, facilitate communication between workers, thus improving collaboration and the exchange of ideas. For these reasons, this type of office has become widespread, although it has also revealed numerous limitations and problems. These include the control of anthropic and electromechanical noise. In this study, measurements of the noise emitted by a heating, ventilation, and air conditioning (HVAC) system were carried out in an open-plan office. The average spectral levels in 1/3-octave bands were compared through correlation analysis to identify any redundant data. A model was then fitted to evaluate and rank the variables by importance. To reduce the number of predictor variables, a feature selection analysis was carried out, and a subset of predictors was extracted to produce an accurate prediction model. Finally, a model based on recursive partitioning to detect the operating conditions of an HVAC system was developed and applied, so as to provide insights into the development and application of this technique in these contexts. The high accuracy of the model (accuracy = 0.9981) suggests the adoption of this tool for several applications.

Read the paper
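Recursive partitioning is the technique behind decision trees, so the workflow described above can be sketched with scikit-learn's CART implementation. This is a minimal illustration, not the study's code: the band levels, the correlation threshold, and the injected HVAC signature are all hypothetical:
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
bands = ['band_%d' % i for i in range(24)]  # hypothetical 1/3-octave band levels
X = pd.DataFrame(rng.normal(50, 5, size=(400, 24)), columns=bands)
y = rng.integers(0, 2, size=400)            # 0 = HVAC off, 1 = HVAC on
X.loc[y == 1, 'band_5'] += 6                # synthetic HVAC signature in one band

# Correlation analysis: keep a band only if it is not strongly
# correlated with a band we have already kept (redundant data)
corr = X.corr().abs()
keep = []
for band in bands:
    if all(corr.loc[band, k] <= 0.95 for k in keep):
        keep.append(band)

tree = DecisionTreeClassifier(random_state=0)  # recursive partitioning (CART)
print(cross_val_score(tree, X[keep], y, cv=5).mean())  # detection accuracy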

Hands-On Reinforcement Learning with R


Reinforcement learning (RL) is an integral part of machine learning (ML), and is used to train algorithms. With this book, you’ll learn how to implement reinforcement learning with R, exploring practical examples such as using tabular Q-learning to control robots.

You’ll begin by learning the basic RL concepts, covering the agent-environment interface, Markov Decision Processes (MDPs), and policy gradient methods. You’ll then use R’s libraries to develop a model based on Markov chains. You will also learn how to solve a multi-armed bandit problem using various R packages. By applying dynamic programming and Monte Carlo methods, you will also find the best policy to make predictions. As you progress, you’ll use Temporal Difference (TD) learning for vehicle routing problem applications. Gradually, you’ll apply the concepts you’ve learned to real-world problems, including fraud detection in finance, and TD learning for planning activities in the healthcare sector. You’ll explore deep reinforcement learning using Keras, which uses the power of neural networks to increase RL’s potential. Finally, you’ll discover the scope of RL and explore the challenges in building and deploying machine learning models.

By the end of this book, you’ll be well-versed with RL and have the skills you need to efficiently implement it with R.

  • Understand how to use MDP to manage complex scenarios
  • Solve classic reinforcement learning problems such as the multi-armed bandit model
  • Use dynamic programming for optimal policy searching
  • Adopt Monte Carlo methods for prediction
  • Apply TD learning to search for the best path
  • Use tabular Q-learning to control robots
  • Handle environments using the OpenAI library to simulate real-world applications
  • Develop deep Q-learning algorithms to improve model performance


The biggest Christmas sale

My books on Python, MATLAB, and R at prices never seen before


A fixture in the Packt calendar, the $5 campaign is something developers across the industry look forward to every year, and 2019 isn't looking to be any different. Packt Publishing has always believed in making tech learning as accessible as possible, and by offering every eBook and video for $5 across the site throughout December, it gives developers the chance to grow their skills, indulge their curiosity, and more, for less.

You can buy my ebooks and videos for $5. These are the titles:

Implement key reinforcement learning algorithms and techniques using different R packages such as the Markov chain, MDP toolbox, contextual, and OpenAI Gym

Discover powerful ways to effectively solve real-world machine learning problems using key libraries including scikit-learn, TensorFlow, and PyTorch

Demonstrate fundamentals of Deep Learning and neural network methodologies using Keras 2.x

A practical guide to mastering reinforcement learning algorithms using Keras

Uncover the power of artificial neural networks by implementing them through R code.

Extract patterns and knowledge from your data in an easy way using MATLAB

Build effective regression models in R to extract valuable insights from real data

Click here to get a preview of the available titles

Linear Decision Boundary of Logistic Regression

Now, we will study the concept of a decision boundary for a binary classification problem. We use synthetic data to create a clear example of how the decision boundary of logistic regression looks in comparison to the training samples. We start by generating two features, X1 and X2, at random. Since there are two features, we can say that the data for this problem are two-dimensional. This makes it easy to visualize. The concepts we illustrate here generalize to cases of more than two features, such as the real-world datasets you’re likely to see in your work; however, the decision boundary is harder to visualize in higher-dimensional spaces.

Perform the following steps:

  1. Generate the features using the following code:
import numpy as np  # NumPy is used for random number generation and arrays

np.random.seed(seed=6)
X_1_pos = np.random.uniform(low=1, high=7, size=(20,1))
print(X_1_pos[0:3])
X_1_neg = np.random.uniform(low=3, high=10, size=(20,1))
print(X_1_neg[0:3])
X_2_pos = np.random.uniform(low=1, high=7, size=(20,1))
print(X_2_pos[0:3])
X_2_neg = np.random.uniform(low=3, high=10, size=(20,1))
print(X_2_neg[0:3])

You don’t need to worry too much about why we selected the values we did; the plotting we do later should make it clear. Notice, however, that we are also going to assign the true class at the same time. The result of this is that we have 20 samples each in the positive and negative classes, for a total of 40 samples, and that we have two features for each sample. We show the first three values of each feature for both positive and negative classes.

The output should be the following:

[Output: Generating synthetic data for a binary classification problem]

  2. Plot these data, coloring the positive samples in red and the negative samples in blue. The plotting code is as follows:
import matplotlib.pyplot as plt  # Matplotlib is used for all the plots below

plt.scatter(X_1_pos, X_2_pos, color='red', marker='x')
plt.scatter(X_1_neg, X_2_neg, color='blue', marker='x')
plt.xlabel('$X_1$')
plt.ylabel('$X_2$')
plt.legend(['Positive class', 'Negative class'])

The result should look like this:

[Figure: Generating synthetic data for a binary classification problem]

In order to use our synthetic features with scikit-learn, we need to assemble them into a matrix. We use NumPy’s block function for this to create a 40 by 2 matrix. There will be 40 rows because there are 40 total samples, and 2 columns because there are 2 features. We will arrange things so that the features for the positive samples come in the first 20 rows, and those for the negative samples after that.

  3. Create a 40 by 2 matrix and then show the shape and the first 3 rows:
X = np.block([[X_1_pos, X_2_pos], [X_1_neg, X_2_neg]])
print(X.shape)
print(X[0:3])

The output should be:

[Output: Combining the synthetic features into a matrix]

We also need a response variable to go with these features. We know how we defined them, but we need an array of y values to let scikit-learn know.

  4. Create a vertical stack (vstack) of 20 1s and then 20 0s to match our arrangement of the features, and reshape to the way that scikit-learn expects. Here is the code:
y = np.vstack((np.ones((20,1)), np.zeros((20,1)))).reshape(40,)
print(y[0:5])
print(y[-5:])

You will obtain the following output:

[Output: Creating the response variable for the synthetic data]

At this point, we are ready to fit a logistic regression model to these data with scikit-learn. We will use all of the data as training data and examine how well a linear model is able to fit the data.

  5. First, import the model class using the following code:
from sklearn.linear_model import LogisticRegression
  6. Now instantiate the model, indicating the liblinear solver, and show the model object using the following code:
example_lr = LogisticRegression(solver='liblinear')
example_lr

The output should be as follows:

[Output: Fitting a logistic regression model to the synthetic data in scikit-learn]

  7. Now train the model on the synthetic data:
example_lr.fit(X, y)

How do the predictions from our fitted model look?

We first need to obtain these predictions, by using the trained model’s .predict method on the same samples we used for model training. Then, in order to add these predictions to the plot, using the color scheme of red = positive class and blue = negative class, we will create two lists of indices to use with the arrays, according to whether the prediction is 1 or 0. See whether you can understand how we’ve used a list comprehension, including an if statement, to accomplish this.

  8. Use this code to get predictions and separate them into indices of positive and negative class predictions. Show the indices of positive class predictions as a check:
y_pred = example_lr.predict(X)
positive_indices = [counter for counter in range(len(y_pred)) if y_pred[counter]==1]
negative_indices = [counter for counter in range(len(y_pred)) if y_pred[counter]==0]
positive_indices

The output should be:

[Output: Positive class prediction indices]

  9. Here is the plotting code:
plt.scatter(X_1_pos, X_2_pos, color='red', marker='x')
plt.scatter(X_1_neg, X_2_neg, color='blue', marker='x')
plt.scatter(X[positive_indices,0], X[positive_indices,1], s=150, marker='o',
edgecolors='red', facecolors='none')
plt.scatter(X[negative_indices,0], X[negative_indices,1], s=150, marker='o',
edgecolors='blue', facecolors='none')
plt.xlabel('$X_1$')
plt.ylabel('$X_2$')
plt.legend(['Positive class', 'Negative class', 'Positive predictions', 'Negative predictions'])

The plot should appear as follows:

[Figure: Predictions and true classes plotted together]

From the plot, it's apparent that the classifier struggles with data points that are close to where you might imagine the linear decision boundary to be; some of these points may end up on the wrong side of that boundary.

  10. Use this code to get the coefficients from the fitted model and print them:

theta_1 = example_lr.coef_[0][0]
theta_2 = example_lr.coef_[0][1]
print(theta_1, theta_2)

The output should look like this:

[Output: Coefficients from the fitted model]

  11. Use this code to get the intercept:
theta_0 = example_lr.intercept_

Now use the coefficients and intercept to define the linear decision boundary. This captures the dividing line of the inequality X2 ≥ −(θ1/θ2)X1 − (θ0/θ2):

X_1_decision_boundary = np.array([0, 10])
X_2_decision_boundary = -(theta_1/theta_2)*X_1_decision_boundary - (theta_0/theta_2)

To summarize the last few steps, after using the .coef_ and .intercept_ attributes to retrieve the model coefficients θ1 and θ2 and the intercept θ0, we then used these to create a line defined by two points, according to the equation we described for the decision boundary.

  12. Plot the decision boundary using the following code, with some adjustments to assign the correct labels for the legend, and to move the legend to a location (loc) outside a plot that is getting crowded:
pos_true = plt.scatter(X_1_pos, X_2_pos, color='red', marker='x', label='Positive class')
neg_true = plt.scatter(X_1_neg, X_2_neg, color='blue', marker='x', label='Negative class')
pos_pred = plt.scatter(X[positive_indices,0], X[positive_indices,1], s=150, marker='o',
edgecolors='red', facecolors='none', label='Positive predictions')
neg_pred = plt.scatter(X[negative_indices,0], X[negative_indices,1], s=150, marker='o',
edgecolors='blue', facecolors='none', label='Negative predictions')
dec = plt.plot(X_1_decision_boundary, X_2_decision_boundary, 'k-', label='Decision boundary')
plt.xlabel('$X_1$')
plt.ylabel('$X_2$')
plt.legend(loc=[0.25, 1.05])

You will obtain the following plot:

[Figure: True classes, predicted classes, and the decision boundary of a logistic regression]

In this post, we discussed the basics of logistic regression along with various other methods for examining the relationship between features and a response variable. To learn how to install the required packages and set up a data science coding environment, read the book Data Science Projects with Python from Packt Publishing.

Why is Logistic Regression Considered a Linear Model?

A model is considered linear if the transformation of features that is used to calculate the prediction is a linear combination of the features. The possibilities for a linear combination are that each feature can be multiplied by a numerical constant, these terms can be added together, and an additional constant can be added. For example, in a simple model with two features, X1 and X2, a linear combination would take the form:

θ0 + θ1X1 + θ2X2

Linear combination of X1 and X2

The constants θi can be any number, positive, negative, or zero, for i = 0, 1, and 2 (although if a coefficient is 0, this removes the corresponding feature from the linear combination). A familiar example of a linear transformation of one variable is a straight line with the equation y = mx + b. In this case, θ0 = b and θ1 = m. θ0 is called the intercept of a linear combination, which should make sense when thinking about the equation of a straight line in slope-intercept form like this.

A linear combination does not include operations such as squaring a feature, multiplying two features together, or taking a logarithm. However, while these transformations are not part of the basic formulation of a linear combination, they could be added to a linear model by engineering features, for example, defining a new feature X3 = X1².

Predictions of logistic regression, which take the form of probabilities, are made using the sigmoid function. This function is clearly non-linear and is given by the following:

p = 1 / (1 + e^(−X))

Non-linear sigmoid function
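A two-line NumPy version of this function makes its behavior easy to check numerically:
import numpy as np

def sigmoid(X):
    # Squashes any real number into a probability between 0 and 1
    return 1 / (1 + np.exp(-X))

print(sigmoid(np.array([-5, 0, 5])))  # approximately [0.0067 0.5 0.9933]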

Why, then, is logistic regression considered a linear model? It turns out that the answer to this question lies in a different formulation of the sigmoid equation, called the logit function. We can derive the logit function by solving the sigmoid function for X; in other words, finding the inverse of the sigmoid function. First, we set the sigmoid equal to p, the probability of observing the positive class, then solve for X, as shown in the following:

p = 1 / (1 + e^(−X))  ⇒  1 + e^(−X) = 1/p  ⇒  e^(−X) = (1 − p)/p  ⇒  X = ln(p / (1 − p))

Solving for X

Here, we’ve used some laws of exponents and logs to solve for X. You may also see the logit expressed as:

logit(p) = ln(p/q) = ln(p / (1 − p)) = X

Logit function

The probability of failure, q, is expressed in terms of the probability of success, p: q = 1 − p, because probabilities sum to 1. Even though in our case credit default would probably be considered a failure in the sense of real-world outcomes, the positive outcome (response variable = 1 in a binary problem) is conventionally considered a "success" in mathematical terminology. The logit function is also called the log odds, because it is the natural logarithm of the odds ratio, p/q. Odds ratios may be familiar from the world of gambling, via phrases such as "the odds are 2 to 1 that team A defeats team B"; odds of 2 to 1 correspond to p = 2/3 and q = 1/3.

In general, what we've called capital X in these manipulations can stand for a linear combination of all the features. For example, this would be X = θ0 + θ1X1 + θ2X2 in our simple case of two features. Logistic regression is considered a linear model because the features included in X are, in fact, only subject to a linear combination when the response variable is considered to be the log odds. This is an alternative way of formulating the problem, as compared to the sigmoid equation.

In summary, the features X1, X2, …, Xj look like this in the sigmoid equation version of logistic regression:

p = 1 / (1 + e^(−(θ0 + θ1X1 + θ2X2 + … + θjXj)))

Sigmoid version of logistic regression

But they look like this in the log odds version, which is why logistic regression is called a linear model:

ln(p / (1 − p)) = θ0 + θ1X1 + θ2X2 + … + θjXj

Log odds version of logistic regression

Because of this way of looking at logistic regression, ideally the features of a logistic regression model would have a linear relationship with the log odds of the response variable.
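We can check this numerically with the model fitted earlier in this post: scikit-learn's decision_function returns exactly the linear combination θ0 + θ1X1 + θ2X2, and it matches the log odds computed from the predicted probabilities:
import numpy as np

p = example_lr.predict_proba(X)[:, 1]  # predicted probability of the positive class
log_odds = np.log(p / (1 - p))         # logit of the predicted probabilities
linear_combination = example_lr.decision_function(X)  # theta_0 + theta_1*X_1 + theta_2*X_2
print(np.allclose(log_odds, linear_combination))      # True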

This post is taken from the book Data Science Projects with Python, written by Stephen Klosterman and published by Packt Publishing. The book explains how to use descriptive analyses and predictive models to inform future operations.