What are Support Vector Machines?

By Ashutosh Roy. Reviewed by Susha Cheriyedath, M.Sc.

Machine learning algorithms have transformed many industries, helping researchers and professionals solve complex problems efficiently. Among them, Support Vector Machines (SVMs) stand out as a robust and versatile tool, especially for classification tasks, with applications ranging from personalized medicine to cultural heritage conservation. This article explores the fundamentals of SVMs, their role in cancer genomics, a graph-embedded extension for one-class classification, and their application to historic building preservation.

Understanding SVMs

To understand why SVMs work well, let's begin with their core principles. An SVM is a discriminative machine learning technique that seeks an optimal hyperplane in a feature space to separate data points belonging to different classes. Unlike generative approaches, which model the probability distribution of each class, an SVM learns only the decision boundary, which keeps it comparatively efficient and resource-friendly.

One of SVM's key properties is its inherent sparsity: the classifier is built from a subset of the training points, the support vectors. Because only these points enter the decision function, prediction stays fast and memory use stays low even when the training set is large.
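
As a minimal sketch of this sparsity, the snippet below evaluates a kernel decision function of the standard form f(x) = Σ αᵢ yᵢ K(xᵢ, x) + b. The training points, the coefficients in `alphas`, and the bias `b` are made-up illustrative values, not the output of a real solver. Points whose coefficient is zero drop out of the sum, so only the support vectors need to be stored.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def decision(x, points, ys, coeffs, bias):
    """Kernel decision function: sum_i alpha_i * y_i * K(x_i, x) + bias."""
    return sum(a * y * rbf_kernel(p, x) for p, y, a in zip(points, ys, coeffs)) + bias

# Hypothetical training set and dual coefficients (illustrative values only).
# A coefficient of zero means the point is NOT a support vector.
train_points = [(0.0, 0.0), (1.0, 1.0), (3.0, 3.0), (4.0, 4.0)]
labels = [-1, -1, 1, 1]
alphas = [0.0, 0.8, 0.8, 0.0]
b = 0.0

# Keep only the support vectors (alpha > 0).
support = [(p, y, a) for p, y, a in zip(train_points, labels, alphas) if a > 0]
sv_points, sv_labels, sv_alphas = zip(*support)

query = (2.5, 2.5)
full = decision(query, train_points, labels, alphas, b)      # uses all 4 points
sparse = decision(query, sv_points, sv_labels, sv_alphas, b)  # uses 2 support vectors

print(len(sv_points), full, sparse)
```

Both calls return the same score, which is why a trained model can discard every non-support vector.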

Another vital aspect of SVM is the kernel trick. When the data is not linearly separable in the original input space, SVM can implicitly map it into a higher-dimensional feature space through a kernel function, which computes inner products in that space without constructing the mapping explicitly. A linear separator in the transformed space then corresponds to a non-linear decision boundary in the original space.
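
To make the idea concrete, the following sketch checks numerically that a degree-2 polynomial kernel, K(x, z) = (x · z)², gives the same value as an ordinary inner product after an explicit mapping into three dimensions. The feature map `phi` and the two sample points are chosen here purely for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2, computed in the input space."""
    return dot(x, z) ** 2

def phi(x):
    """Explicit feature map for the degree-2 kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x = (1.0, 2.0)
z = (3.0, -1.0)

kernel_value = poly2_kernel(x, z)      # never leaves the 2-D input space
explicit_value = dot(phi(x), phi(z))   # inner product in the 3-D feature space

print(kernel_value, explicit_value)
```

The two values agree up to floating-point rounding. For kernels such as the Gaussian RBF, the corresponding feature space is infinite-dimensional, so the kernel shortcut is the only practical route.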

Furthermore, SVM's strength lies in finding the maximum-margin separator: the hyperplane is positioned to maximize the margin, that is, the distance between the hyperplane and the closest training points of each class (the support vectors). A wider margin tends to improve the model's generalization, leading to better performance on unseen data.
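
In the standard textbook notation (the symbols w, b, xᵢ, and yᵢ ∈ {−1, +1} are assumed here and do not appear in the article), the maximum-margin idea can be written as:

```latex
\begin{aligned}
&\text{Separating hyperplane:} && \mathbf{w}^\top \mathbf{x} + b = 0 \\
&\text{Canonical scaling:} && y_i\,(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 \quad \text{for all } i \\
&\text{Margin width:} && \frac{2}{\lVert \mathbf{w} \rVert} \\
&\text{Maximum-margin problem:} && \min_{\mathbf{w},\, b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^2
  \quad \text{s.t.}\quad y_i\,(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1
\end{aligned}
```

Minimizing the norm of w is equivalent to maximizing the margin width, and the training points that satisfy the constraint with equality are the support vectors.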

Depending on whether the classes can be separated without error, SVM can be formulated in two main ways: hard-margin and soft-margin. Hard-margin SVM applies to linearly separable data and seeks a hyperplane that classifies every training point correctly. Real-world data, however, is rarely perfectly separable. Soft-margin SVM therefore tolerates some margin violations and misclassifications, and introduces a regularization parameter (C) to control the trade-off between margin width and training errors: a larger C penalizes errors more heavily, while a smaller C favors a wider margin.
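
The soft-margin objective, ½‖w‖² + C Σ max(0, 1 − yᵢ(w · xᵢ + b)), can be sketched in a few lines of plain Python. This is an illustrative subgradient-descent trainer on a made-up toy dataset, not the quadratic-programming solvers that production SVM libraries typically use. The function names, learning rate, and epoch count are assumptions of this sketch.

```python
def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Minimize 0.5*||w||^2 + C * sum(hinge loss) by full-batch subgradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)   # gradient of the 0.5*||w||^2 regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # margin violated: the hinge loss is active
                for j in range(n_features):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical, linearly separable toy data (two clusters).
X = [(-2.0, -1.0), (-1.0, -2.0), (-1.0, -1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y, C=1.0)
train_accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
print(w, b, train_accuracy)
```

Raising `C` makes each margin violation more expensive, pushing the solution toward the hard-margin case. Lowering it lets the regularizer dominate and widens the margin.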

Kernel SVMs and Cancer Genomics

Cancer is a complex disease with diverse subtypes, each requiring tailored treatments for better patient outcomes. This is where SVM has emerged as a game-changer in personalized medicine and cancer genomics.

Cancer Classification

A linear SVM is powerful when the data is linearly separable, but many cancer datasets exhibit non-linear relationships. Kernel SVM addresses this limitation by implicitly transforming the data into a higher-dimensional space where a linear separator can be applied effectively. Using kernel functions such as the polynomial, Gaussian Radial Basis Function (RBF), and Laplacian RBF kernels, SVM can classify different cancer types based on gene expression data.
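
The three kernels named above can be written down directly. The definitions below follow common textbook forms, and the parameter names (`degree`, `coef0`, `sigma`) and the two sample vectors are assumptions of this sketch. Note that some libraries define the Laplacian kernel with the L1 (Manhattan) distance rather than the Euclidean distance used here.

```python
import math

def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """(x . z + coef0) ** degree"""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree

def gaussian_rbf_kernel(x, z, sigma=1.0):
    """exp(-||x - z||^2 / (2 * sigma^2)), using the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def laplacian_rbf_kernel(x, z, sigma=1.0):
    """exp(-||x - z|| / sigma), using the (unsquared) Euclidean distance."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)))
    return math.exp(-dist / sigma)

# Two hypothetical expression profiles (values are illustrative only).
sample_a = (0.0, 0.0)
sample_b = (3.0, 4.0)   # Euclidean distance from sample_a is 5

poly_value = polynomial_kernel(sample_a, sample_b, degree=2, coef0=1.0)
gauss_value = gaussian_rbf_kernel(sample_a, sample_b, sigma=5.0)
laplace_value = laplacian_rbf_kernel(sample_a, sample_b, sigma=5.0)
print(poly_value, gauss_value, laplace_value)
```

Both RBF kernels equal 1 for identical samples and decay toward 0 as samples move apart. The Laplacian form decays more slowly at large distances because it does not square the distance.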

Cancer subtyping is crucial for understanding the unique characteristics of each subtype and guiding treatment decisions. SVM has shown remarkable efficiency in handling high-dimensional data with limited sample sizes, making it an essential tool in identifying cancer subtypes and their distinct molecular profiles.

Biomarker Discovery with SVM

Biomarkers play a crucial role in cancer diagnosis and prognosis. SVM's feature selection capabilities, such as SVM Recursive Feature Elimination (SVM-RFE), have gained popularity in identifying relevant biomarkers from vast genomics datasets. By analyzing the relationship between these biomarkers and cancer outcomes, SVM enables the development of robust cancer classifiers, paving the way for personalized treatment plans and improved patient care.
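
The elimination loop behind SVM-RFE can be sketched as follows, assuming the usual linear-SVM ranking criterion in which each feature is scored by its squared weight and the lowest-scoring feature is removed at every round. The small subgradient trainer, the function names, and the three-feature toy dataset are all hypothetical. Feature 0 is constructed to be strongly informative, feature 1 weakly informative, and feature 2 to carry no class information.

```python
def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Linear soft-margin SVM trained by full-batch subgradient descent."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                for j in range(n_features):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y, n_keep=1):
    """Repeatedly retrain and drop the feature with the smallest squared weight."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated

# Hypothetical data: columns are (strong marker, weak marker, uninformative feature).
X = [
    (2.0, 0.5, 1.0), (2.0, 0.5, -1.0), (3.0, 0.5, 1.0), (3.0, 0.5, -1.0),
    (-2.0, -0.5, -1.0), (-2.0, -0.5, 1.0), (-3.0, -0.5, -1.0), (-3.0, -0.5, 1.0),
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

selected, eliminated = svm_rfe(X, y, n_keep=1)
print(selected, eliminated)
```

On real genomics data the same loop would typically drop features in batches and evaluate each subset with cross-validation, since retraining once per gene quickly becomes expensive.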

Drug Discovery for Cancer Therapy

Discovering effective cancer drugs is a challenging process. Kernel SVM has played a significant role in the virtual screening and prioritization of potential anticancer drugs. By integrating genomic features of cancer cells, SVM models can predict drug efficacy and assist in selecting active compounds with the highest likelihood of success. Additionally, SVM has been successfully applied in predicting protein-ligand binding affinities, helping researchers identify novel drug targets and optimize drug design for improved therapeutic outcomes.

Graph-Embedded Subspace Support Vector Data Description

One-class classification, or anomaly detection, is crucial in various industries, including manufacturing, finance, and cybersecurity. Traditional one-class classifiers often operate in the original feature space and may suffer from the curse of dimensionality.
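
For orientation, the classical support vector data description (SVDD) named in this section's heading encloses the training data in the smallest hypersphere that still tolerates a few outliers. The formulation below is the standard textbook form, not the graph-embedded variant, and the symbols (center a, radius R, slack variables ξᵢ, penalty C) are assumptions of this sketch:

```latex
\begin{aligned}
\min_{R,\,\mathbf{a},\,\boldsymbol{\xi}} \quad & R^2 + C \sum_{i=1}^{N} \xi_i \\
\text{s.t.} \quad & \lVert \mathbf{x}_i - \mathbf{a} \rVert^2 \le R^2 + \xi_i,
\qquad \xi_i \ge 0, \qquad i = 1, \dots, N
\end{aligned}
```

A new point z is treated as an anomaly when its squared distance to the center, ‖z − a‖², exceeds R². Subspace variants apply the same constraint after projecting the data into a learned lower-dimensional space.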

Graph-Embedded Subspace Learning: Graph-Embedded Subspace Support Vector Data Description (GESSVDD) is a subspace learning framework for one-class classification that uses graph embedding to address this problem. By representing the data in the form of graphs, the framework allows other optimization goals to be incorporated and offers insight into existing subspace one-class techniques. This leads to a more general and efficient solution.

Spectral Solutions and Spectral Regression: GESSVDD introduces spectral and spectral regression-based solutions as alternatives to traditional gradient-based techniques. Spectral regression efficiently tackles the eigen-decomposition step, which can be cumbersome for large-scale datasets, making GESSVDD more scalable and practical.

Graph Selection and Optimization: One of the key advantages of GESSVDD is the ability to choose different graphs (Lx) to enforce specific constraints and data relationships. This flexibility allows the model to adapt to diverse datasets, improving performance across various one-class classification tasks.

Non-Linear Data Description: GESSVDD also addresses the need for non-linear data descriptions by employing the non-linear projection trick (NPT). NPT enables the use of kernel tricks, facilitating non-linear mappings even with linear variants of the method, enhancing the model's accuracy in capturing complex relationships.

Moisture Content Assessment in Historic Buildings

Historic buildings hold immense cultural significance, representing the rich legacy of past civilizations and architectural brilliance. Preserving these architectural marvels for future generations is a shared responsibility. However, one of the key challenges lies in maintaining the structural integrity of these historical structures, particularly when dealing with issues such as excessive moisture content in brick walls.

Efforts to preserve historic buildings require accurate assessments of moisture content that do not damage the delicate structures. Traditionally, destructive techniques such as the gravimetric method have been used for moisture content assessment, but these intrusive techniques raise conservation concerns. To overcome this limitation, non-destructive methods, including chemical indicator paper and thermal methods, have been used to provide qualitative assessments. However, these methods lack the precision needed to quantify moisture content accurately.

With the advancements in artificial intelligence, SVM algorithms have emerged as a promising tool for building preservation and non-destructive testing. SVM's ability to handle high-dimensional data and its capability to classify data points efficiently make it well suited for moisture content assessment in historic buildings.

The Hybrid Approach

A novel hybrid approach combines SVM algorithms with two non-destructive research methods: dielectric and microwave. By training the models with a large and representative dataset, the SVM-based algorithms can accurately assess the mass moisture content of brick walls in historic buildings.

Experimental Research and Results

To validate the hybrid approach, the researchers collected data from 10 historical buildings representing different periods of history. The dataset included 290 sample sets, with seven predictor variables encompassing wall moisture (XD, XM, and Um) and salt concentration (XC, XS, XA).

Statistical analysis of the experimental data indicated that the dataset was representative, strengthening the credibility of the results. Of the algorithms compared, SVM proved best suited to non-destructive moisture content assessment, with a coefficient of determination (R²) of 0.968. It also produced the lowest root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), further supporting its accuracy and reliability.
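
For readers unfamiliar with these four metrics, the sketch below computes them from their usual definitions. The measured and predicted moisture values are invented for illustration and are unrelated to the study's data. The function and variable names are likewise assumptions.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and MAPE (in percent) for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when a true value is zero; the toy data avoids that case.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "MAPE": mape}

# Hypothetical mass moisture contents in percent (illustrative only).
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

metrics = regression_metrics(measured, predicted)
print(metrics)
```

An R² close to 1 means the predictions explain most of the variance in the measurements, while RMSE, MAE, and MAPE summarize the size of the remaining errors in absolute and relative terms.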

Advantages of the Hybrid Approach

Non-Destructive Assessment: The hybrid approach eliminates the need for destructive methods, preserving the integrity of historic buildings while gathering critical moisture content data.

Precision and Reliability: The combination of SVM algorithms and non-destructive methods provides precise and reliable results, essential for understanding the moisture levels in these delicate structures.

Cost-effectiveness: Compared to nuclear methods, the proposed approach is more cost-effective, making it feasible for in situ testing in historic buildings.

References

1. Huang, S., Cai, N., Pacheco, P. P., Narrandes, S., Wang, Y., & Xu, W. (2018). Applications of Support Vector Machine (SVM) Learning in Cancer Genomics. Cancer Genomics & Proteomics, 15(1), 41–51. DOI: https://doi.org/10.21873/cgp.20063

2. Were, K., Bui, D. T., Dick, Ø. B., & Singh, B. R. (2015). A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an Afromontane landscape. Ecological Indicators, 52, 394–403. DOI: https://doi.org/10.1016/j.ecolind.2014.12.028

3. Sohrab, F., Iosifidis, A., Gabbouj, M., & Raitoharju, J. (2023). Graph-embedded subspace support vector data description. Pattern Recognition, 133, 108999. DOI: https://doi.org/10.1016/j.patcog.2022.108999

4. Hoła, A., & Czarnecki, S. (2023). Random forest algorithm and support vector machine for nondestructive assessment of mass moisture content of brick walls in historic buildings. Automation in Construction, 149, 104793. DOI: https://doi.org/10.1016/j.autcon.2023.104793

5. Awad, M., & Khanna, R. (2015). Support Vector Machines for Classification. Efficient Learning Machines, 39–66. DOI: https://doi.org/10.1007/978-1-4302-5990-9_3

Last Updated: Jul 24, 2023

Written by

Ashutosh Roy

Ashutosh Roy has an MTech in Control Systems from IIEST Shibpur. He holds a keen interest in the field of smart instrumentation and has actively participated in the International Conferences on Smart Instrumentation. During his academic journey, Ashutosh undertook a significant research project focused on smart nonlinear controller design. His work involved utilizing advanced techniques such as backstepping and adaptive neural networks. By combining these methods, he aimed to develop intelligent control systems capable of efficiently adapting to non-linear dynamics.
