GOVERNMENT ENGINEERING COLLEGE, THRISSUR
2018

M.TECH SEMINAR REPORT ON
MACHINE LEARNING APPROACHES OF LEAF RECOGNITION

Presented by
NAYANA P B (Reg. No. TCR17ECCP13)

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
GOVERNMENT ENGINEERING COLLEGE, THRISSUR
THRISSUR – 680 009

ABSTRACT

Kerala is famous for its Ayurveda therapies. These medicines are based on plant parts, and Ayurvedic medicines are preferred for hepatitis patients, fracture treatments and similar conditions.
It is therefore necessary to identify the plants accurately in order to prepare the medicines; if the wrong plant is selected, the quality of the drug suffers. One way to recognize a leaf is visual inspection, but this can be inaccurate. A botanist may be able to recognize medicinal plants, but it is difficult for common people, so there is a need for automatic recognition of medicinal plants, which will mainly help common people. The major steps of leaf recognition are feature extraction and classification. There are different approaches to leaf classification, including deep learning and machine learning. Some of the machine learning approaches are the Naive Bayes classifier algorithm, the support vector machine algorithm, artificial neural networks, random forests and the k-nearest neighbour algorithm.

ACKNOWLEDGEMENT

I express my indebtedness to the Almighty for, among many other things, the success of this seminar. I take this opportunity to place on record my heartfelt gratitude and thanks to Dr. Thajudin Ahamed V I, Head of the Department of Electronics and Communication Engineering, Govt. Engineering College, Thrissur, for his advice, support and guidance throughout the seminar. I am deeply grateful to Mr. Mohanan K P, who supported us in numerous ways and played an invaluable role in coordinating the seminar. I express my gratitude to all faculty members and supporting staff of the department for the help and support given to me. Finally, I would like to acknowledge my deep sense of gratitude to all well-wishers and friends who helped me directly or indirectly to complete this work.
Contents

1 INTRODUCTION
2 FEATURES OF LEAF IMAGE
3 MACHINE LEARNING
  3.1 Concept
  3.2 ALGORITHMS
    3.2.1 Naive Bayes Classifier Algorithm
    3.2.2 Support Vector Machine Learning Algorithm
    3.2.3 Random Forest Machine Learning Algorithm
    3.2.4 Artificial neural networks
    3.2.5 Multilayer perceptron
    3.2.6 K nearest neighbor
    3.2.7 Probabilistic neural networks
4 CONCLUSION

List of Figures

3.1 Architecture of ANN

List of abbreviations

ANN   Artificial neural networks
CNN   Convolutional neural networks
KNN   K nearest neighbor
MLP   Multilayer perceptron
OSH   Optimal Separating Hyperplane
PNN   Probabilistic neural network
SVM   Support vector machine

Chapter 1
INTRODUCTION

Ayurvedic medicines are based on herbs, and correct identification of the herbs leads to quality assurance of Ayurvedic drugs. Each plant differs from other plants in its unique properties.
Depending on their shape, colour and texture, we can identify the plants uniquely.

Chapter 2
FEATURES OF LEAF IMAGE

Chapter 3
MACHINE LEARNING

3.1 Concept

Machine learning applications are highly automated and self-modifying: they continue to improve over time with minimal human intervention as they learn from more data. To address the complex nature of various real-world data problems, specialized machine learning algorithms have been developed to solve these problems. Machine learning algorithms are classified as:

- Supervised machine learning algorithms: algorithms that make predictions on a given set of samples. A supervised machine learning algorithm searches for patterns within the value labels assigned to data points.
- Unsupervised machine learning algorithms: there are no labels associated with the data points. These algorithms organize the data into a group of clusters to describe its structure and make complex data look simple and organized for analysis.
- Reinforcement machine learning algorithms: these algorithms choose an action based on each data point and later learn how good the decision was. Over time, the algorithm changes its strategy to learn better and achieve the best reward.

3.2 ALGORITHMS

The common machine learning algorithms considered here are the Naive Bayes classifier algorithm, the support vector machine algorithm, artificial neural networks, random forests and k-nearest neighbours.
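As a rough illustration of the difference between the supervised and unsupervised categories above, the following sketch fits a classifier when species labels are available and a clustering model when they are not. The feature values, labels and model choices are invented purely for illustration and are not taken from the report.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.cluster import KMeans

    # Hypothetical data: six leaves described by three features (e.g. area, perimeter, aspect ratio)
    X = np.array([[5.1, 3.5, 1.4],
                  [4.9, 3.0, 1.4],
                  [6.7, 3.1, 4.7],
                  [6.3, 2.5, 5.0],
                  [5.0, 3.4, 1.5],
                  [6.4, 3.2, 4.5]])
    y = np.array([0, 0, 1, 1, 0, 1])     # species labels, used only in the supervised case

    # Supervised: learn from labelled samples, then predict the class of a new leaf
    clf = LogisticRegression().fit(X, y)
    print("predicted class:", clf.predict([[5.0, 3.3, 1.4]])[0])

    # Unsupervised: no labels; group the same leaves into clusters instead
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print("cluster assignments:", km.labels_)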
3.2.1 Naive Bayes Classifier Algorithm

It would be difficult and practically impossible to classify a web page, a document, an email or any other lengthy text manually. This is where the Naive Bayes classifier machine learning algorithm comes to the rescue. A classifier is a function that assigns an element of a population to one of the available categories. For instance, spam filtering is a popular application of the Naive Bayes algorithm: the spam filter is a classifier that assigns the label "Spam" or "Not Spam" to each email. The Naive Bayes classifier is among the most popular learning methods grouped by similarity; it works on the well-known Bayes theorem of probability to build machine learning models, particularly for disease prediction and document classification. It is a simple classification of words based on Bayes' probability theorem for subjective analysis of content.

When to use the Machine Learning algorithm – Naive Bayes Classifier?

- If you have a moderate or large training data set.
- If the instances have several attributes.
- Given the classification parameter, the attributes which describe the instances should be conditionally independent.

Applications of Naive Bayes Classifier

- Sentiment analysis: it is used at Facebook to analyse status updates expressing positive or negative emotions.
- Document categorization: Google uses document classification to index documents and find relevancy scores, i.e. the PageRank. The PageRank mechanism considers the pages marked as important in the databases that were parsed and classified using a document classification technique.
- The Naive Bayes algorithm is also used for classifying news articles about technology, entertainment, sports, politics, etc.
- Email spam filtering: Google Mail uses the Naive Bayes algorithm to classify your emails as Spam or Not Spam.

Advantages of the Naive Bayes Classifier Machine Learning Algorithm

- The Naive Bayes classifier algorithm performs well when the input variables are categorical.
- A Naive Bayes classifier converges faster, requiring relatively little training data compared with other discriminative models such as logistic regression, when the Naive Bayes conditional independence assumption holds.
- With the Naive Bayes classifier algorithm, it is easier to predict the class of a test data set, and it is a good bet for multi-class predictions as well.
- Though it requires the conditional independence assumption, the Naive Bayes classifier has shown good performance in various application domains.
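A minimal sketch of how such a classifier might be trained and evaluated, here using scikit-learn's GaussianNB on an invented set of leaf feature vectors (the text applications listed above would typically use the multinomial variant on word counts instead); the data and the train/test split are purely illustrative assumptions.

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Invented feature vectors (e.g. leaf area, aspect ratio, mean intensity) for three species
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 3)) + np.repeat([[0, 0, 0], [2, 2, 2], [4, 0, 2]], 20, axis=0)
    y = np.repeat([0, 1, 2], 20)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    # Fit class priors and per-feature class-conditional Gaussians
    # (the "naive" conditional independence assumption)
    nb = GaussianNB().fit(X_train, y_train)

    print("test accuracy:", accuracy_score(y_test, nb.predict(X_test)))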
3.2.2 Support Vector Machine Learning Algorithm

A support vector machine is a supervised machine learning algorithm for classification or regression problems in which the dataset teaches the SVM about the classes so that the SVM can classify any new data. It works by classifying the data into different classes by finding a line (hyperplane) which separates the training data set into classes. As there are many such separating hyperplanes, the SVM algorithm tries to maximize the distance between the classes involved; this is referred to as margin maximization. If the hyperplane that maximizes the distance between the classes is identified, the probability of generalizing well to unseen data is increased.

SVMs are classified into two categories:

- Linear SVMs: the training data, i.e. the classes, are separated by a hyperplane.
- Non-linear SVMs: it is not possible to separate the training data using a hyperplane in the original feature space.

Advantages of Using SVM

- SVM offers the best classification performance (accuracy) on the training data.
- SVM renders more efficiency for correct classification of future data.
- SVM does not make any strong assumptions about the data.
- It does not over-fit the data.

Applications of Support Vector Machine

SVM is commonly used for stock market forecasting by various financial institutions. For instance, it can be used to compare the relative performance of stocks against other stocks in the same sector. The relative comparison of stocks helps manage investment decisions based on the classifications made by the SVM learning algorithm.
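As a hedged illustration of the linear versus non-linear distinction, the sketch below fits both kinds of SVM on a toy two-class dataset that a single straight line cannot separate; the dataset and parameter values are chosen only for demonstration.

    from sklearn.svm import SVC
    from sklearn.datasets import make_moons

    # Toy two-class data that is not linearly separable (a stand-in for two leaf classes)
    X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

    # Linear SVM: separates the classes with a single maximum-margin hyperplane
    linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)

    # Non-linear SVM: the RBF kernel maps the data so that a separating hyperplane exists
    rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

    print("linear SVM training accuracy:", linear_svm.score(X, y))
    print("RBF SVM training accuracy:   ", rbf_svm.score(X, y))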
3.2.3 Random Forest Machine Learning Algorithm

Random forest is a go-to machine learning algorithm that uses a bagging approach to create a collection of decision trees, each built from a random subset of the data. A model is trained several times on random samples of the dataset to achieve good prediction performance from the random forest algorithm. In this ensemble learning method, the outputs of all the decision trees in the random forest are combined to make the final prediction. The final prediction of the random forest algorithm is derived by polling the results of each decision tree, or simply by going with the prediction that appears most often among the trees.

Why use Random Forest Machine Learning Algorithm?

- There are many good open-source, free implementations of the algorithm available in Python and R.
- It maintains accuracy when there is missing data and is also resistant to outliers.
- It is simple to use, as the basic random forest algorithm can be implemented with just a few lines of code.
- Random forest machine learning algorithms help data scientists save data preparation time, as they do not require any input preparation and are capable of handling numerical, binary and categorical features without scaling, transformation or modification.
- Implicit feature selection: the algorithm gives estimates of which variables are important in the classification (see the sketch after this list).

Advantages of Using Random Forest Machine Learning Algorithms

- Overfitting is less of an issue with random forests than with decision tree machine learning algorithms; there is no need to prune the random forest.
- These algorithms are fast, but not in all cases. A random forest algorithm, when run on an 800 MHz machine with a dataset of 100 variables and 50,000 cases, produced 100 decision trees in 11 minutes.
- Random forest is one of the most effective and versatile machine learning algorithms for a wide variety of classification and regression tasks, as it is more robust to noise.
- It is difficult to build a bad random forest. In the implementation of random forest machine learning algorithms, it is easy to determine which parameters to use because the results are not sensitive to the parameters used to run the algorithm. One can easily build a decent model without much tuning.
- Random forest machine learning algorithms can be grown in parallel.
- The algorithm runs efficiently on large databases.
- It has high classification accuracy.

Drawbacks of Using Random Forest Machine Learning Algorithms

- They might be easy to use, but analysing them theoretically is difficult. A large number of decision trees in the random forest can slow down the algorithm when making real-time predictions.
- If the data consists of categorical variables with different numbers of levels, then the algorithm gets biased in favour of those attributes that have more levels. In such situations, variable importance scores do not seem to be reliable.
- When using the random forest algorithm for regression tasks, it does not predict beyond the range of the response values in the training data.

Applications of Random Forest Machine Learning Algorithms

- Random forest algorithms are used by banks to predict whether a loan applicant is likely to be high risk.
- They are used in the automobile industry to predict the failure or breakdown of a mechanical part.
- They are used in the healthcare industry to predict whether a patient is likely to develop a chronic disease.
- They can also be used for regression tasks such as predicting the average number of social media shares and performance scores.
- Recently, the algorithm has also made its way into predicting patterns in speech recognition software and classifying images and texts.
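As a small, hedged sketch of these ideas, the code below fits a random forest on the Iris measurements (used here only as a convenient stand-in for a table of leaf features) and prints the implicit feature-importance estimates; the dataset and parameters are illustrative, not taken from the report.

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    # Iris petal/sepal measurements serve as a stand-in for leaf feature vectors
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # Bagging of decision trees: each tree is grown on a bootstrap sample of the data
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

    print("test accuracy:", forest.score(X_test, y_test))
    # Implicit feature selection: an importance estimate for each input variable
    print("feature importances:", forest.feature_importances_)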
3.2.4 Artificial neural networks

An ANN is based on a collection of connected units or nodes called artificial neurons. Each connection between artificial neurons can transmit a signal from one to another. The artificial neuron that receives the signal can process it and then signal the artificial neurons connected to it. In common ANN implementations, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is calculated by a non-linear function of the sum of its inputs. Artificial neurons and connections typically have a weight that adjusts as learning proceeds; the weight increases or decreases the strength of the signal at a connection. Artificial neurons may have a threshold such that the signal is sent only if the aggregate signal crosses that threshold. Typically, artificial neurons are organized in layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first (input) layer to the last (output) layer, possibly after traversing the layers multiple times.

The phases of operation of an ANN are a training phase and a testing phase. Input images are used to train the ANN in the training phase, and in the testing phase the trained network identifies which trained class the presented image is closest to.

Figure 3.1: Architecture of ANN

The feedforward back-propagation neural network is shown in the figure. A feedforward neural network is an artificial neural network in which connections between the units do not form a cycle. Backpropagation is a method used in artificial neural networks to calculate the gradient needed to update the weights of the network. It is commonly used to train deep neural networks, a term used for neural networks with more than one hidden layer. O1, O2, ..., Om represent the output vector, which is basically the plant class, and F1, F2, ..., Fn represent the input vector, which consists of the features of the image. Increasing the number of hidden layers can improve accuracy, although it also increases the complexity of the network.
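To make the notation above concrete, the following minimal numpy sketch performs one forward pass through a network with an input vector F1..Fn, a single hidden layer and an output vector O1..Om. The weights here are random placeholders rather than trained values; in practice they would be adjusted by backpropagation.

    import numpy as np

    def sigmoid(z):
        # Non-linear activation applied to the weighted sum of a neuron's inputs
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    n_features, n_hidden, n_classes = 5, 8, 3   # F1..Fn inputs, one hidden layer, O1..Om outputs

    # Weights and biases; in a real network these are learned by backpropagation
    W1, b1 = rng.normal(size=(n_hidden, n_features)), np.zeros(n_hidden)
    W2, b2 = rng.normal(size=(n_classes, n_hidden)), np.zeros(n_classes)

    x = rng.normal(size=n_features)             # feature vector F1..Fn of one leaf image

    hidden = sigmoid(W1 @ x + b1)               # signals travel from the input layer to the hidden layer
    output = sigmoid(W2 @ hidden + b2)          # ... and on to the output layer O1..Om

    print("predicted class:", int(np.argmax(output)))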
3.2.5 Multilayer perceptron

A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes; except for the input nodes, each node is a neuron that uses a nonlinear activation function. The MLP utilizes a supervised learning technique called backpropagation for training. Its multiple layers and non-linear activation distinguish the MLP from a linear perceptron, and it can distinguish data that is not linearly separable. The MLP consists of three or more layers (an input and an output layer with one or more hidden layers) of nonlinearly activating nodes, making it a deep neural network. Since MLPs are fully connected, each node in one layer connects with a certain weight to every node in the following layer.
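A brief, hedged sketch of training such a network with scikit-learn's MLPClassifier on synthetic feature vectors standing in for leaf features; the hidden-layer size and other parameters are illustrative choices, not values from the report.

    from sklearn.neural_network import MLPClassifier
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    # Synthetic stand-in for leaf feature vectors belonging to three plant classes
    X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                               n_classes=3, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # Feature scaling helps the gradient-based backpropagation training converge
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # Fully connected network with one hidden layer of 16 nonlinearly activated neurons,
    # trained with backpropagation
    mlp = MLPClassifier(hidden_layer_sizes=(16,), activation="relu",
                        max_iter=2000, random_state=0).fit(X_train, y_train)

    print("test accuracy:", mlp.score(X_test, y_test))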
3.2.6 K nearest neighbor

In KNN, objects are classified based on the similarity between the training images and the test image. The training stage includes feature extraction, storing the feature vectors and labelling the training images. An unlabelled test point is compared with the stored vectors, its nearest neighbours being the closest points in the feature space, and the object is classified according to the labels of its k nearest neighbours. k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms.
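A minimal sketch of the two stages described above, using scikit-learn's KNeighborsClassifier with the Iris measurements standing in for stored leaf feature vectors; the value of k and the train/test split are illustrative assumptions.

    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    # Training stage: store labelled feature vectors; no parameters are fitted ("lazy" learning)
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

    # Testing stage: each test vector is labelled by a majority vote of its 5 nearest training vectors
    print("test accuracy:", knn.score(X_test, y_test))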
3.2.7 Probabilistic neural networks

A PNN is also a feedforward neural network. When an input is applied, the first layer calculates the distance between the input image's feature vector and each training image's feature vector. The second layer sums these contributions for each class and produces its net output as a vector of class probabilities.
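To illustrate the two layers described above, here is a minimal numpy sketch of a PNN, assuming a Gaussian kernel in the pattern layer and an invented two-class training set; the smoothing parameter sigma and the data are hypothetical.

    import numpy as np

    def pnn_predict(x, train_X, train_y, sigma=0.5):
        """Minimal probabilistic neural network: one Gaussian kernel per training example."""
        classes = np.unique(train_y)
        # Pattern layer: distance of the input vector to every stored training vector
        sq_dist = np.sum((train_X - x) ** 2, axis=1)
        kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: add the kernel responses belonging to each class
        scores = np.array([kernel[train_y == c].sum() for c in classes])
        probs = scores / scores.sum()            # output: a vector of class probabilities
        return classes[np.argmax(probs)], probs

    # Tiny invented example: four stored leaf feature vectors from two classes
    train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
    train_y = np.array([0, 0, 1, 1])

    label, probs = pnn_predict(np.array([0.95, 1.0]), train_X, train_y)
    print("predicted class:", label, "class probabilities:", probs)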
Chapter 4
CONCLUSION

Leaf recognition is composed of mainly two steps: feature extraction and classification. There are a number of classification algorithms that can be used for image recognition; we can use SVM, KNN or ANN, among others. KNN is the most suitable for medicinal plant recognition, because it is mainly used for multi-class classification. The morphological parameters are mainly used for the recognition of the leaf, and as the number of features used to recognize the plant increases, the efficiency of the leaf recognition system also increases.