A Photomicrographic Dataset of Rocks for the Accurate Classification of Minerals

Background & Summary

Geological Context and Motivation

Accurate mineral identification in igneous and metamorphic rocks serves as the foundation for critical applications spanning industrial resource assessment to fundamental geological interpretation. The economic significance of these determinations is substantial: quartz identification directly impacts the $1 trillion semiconductor industry[1](https://www.nature.com/articles/s41597-025-05879-9#ref-CR1 “2025 semiconductor industry outlook | Deloitte Insights. https://www.deloitte.com/us/en/insights/industry/technology/technology-media-telecom-outlooks/semiconductor-industry-outlook.html

(Accessed: 24th June 2025).“),[2](https://www.nature.com/articles/s41597-025-05879-9#ref-CR2 “Semiconductors have a big opportunity—but barriers to scale remain | McKinsey. https://www.mckinsey.com/industries/semiconductors/our-insights/semiconductors-have-a-big-opportunity-but-barriers-to-scale-remain

(Accessed: 24th June 2025).“), where high-purity SiO₂ is irreplaceable for silicon wafer production[3](https://www.nature.com/articles/s41597-025-05879-9#ref-CR3 “High Purity Quartz Market Size & Growth | Industry Analysis 2031. https://www.transparencymarketresearch.com/high-purity-quartz-market.html

(Accessed: 24th June 2025).“). Feldspar classification, meanwhile, guides ceramic manufacturing processes valued at $1.93 billion annually[4](https://www.nature.com/articles/s41597-025-05879-9#ref-CR4 “Global Feldspar Market Size, Trends, Share, Forecast 2033. https://www.custommarketinsights.com/report/feldspar-market/

(Accessed: 25th June 2025).“), with its fluxing properties critical for 79% of global tile and sanitaryware production5,[6](https://www.nature.com/articles/s41597-025-05879-9#ref-CR6 “Feldspar Market Global Forecast to 2022 | MarketsandMarkets. https://www.marketsandmarkets.com/Market-Reports/feldspar-market-201399009.html

(Accessed: 25th June 2025).“). Traditional petrographic analysis relies heavily on point counting methodology, where geologists manually identify and quantify mineral phases in thin sections under polarized light microscopy. However, this approach exhibits significant limitations, with inter-operator variability ranging from 15–30% due to subjective interpretation of complex optical properties, including pleochroism (biotite displaying characteristic yellow to dark brown color variations[7](https://www.nature.com/articles/s41597-025-05879-9#ref-CR7 “Biotite. https://www.science.smith.edu/geosciences/petrology/petrography/biotite/biotite.html

(Accessed: 25th June 2025).“)) and birefringence measurements (quartz exhibiting 0.009 birefringence values[8](https://www.nature.com/articles/s41597-025-05879-9#ref-CR8 “Quartz. https://www.science.smith.edu/geosciences/petrology/petrography/quartz/quartz.html

(Accessed: 25th June 2025).“)). Methodological biases, such as inconsistent classification rules (e.g., the Gazzi-Dickinson vs. Indiana methods9), further compound these errors, undermining reproducibility in mineralogical studies10.

The temporal constraints of conventional optical microscopy further compound these challenges. A comprehensive thin-section analysis typically requires 2–4 hours of expert examination, contrasting sharply with the minutes required for automated identification systems11. While optical microscopy remains the foundational technique for petrographic analysis[12](https://www.nature.com/articles/s41597-025-05879-9#ref-CR12 “Correcting Focus Drift in Live-Cell Microscopy | Nikon’s MicroscopyU. https://www.microscopyu.com/applications/live-cell-imaging/correcting-focus-drift-in-live-cell-microscopy

(Accessed: 25th June 2025).“), its inherent scalability limitations create bottlenecks in large-scale geological surveys and industrial applications where rapid, consistent mineral identification is essential for decision-making processes13,14.

Datasets Limitations

Limitations of datasets that are not publicly available

In recent years, the application of machine learning (ML) techniques to classify rocks and minerals using microscopic and spectral data has achieved notable success. Various studies have utilized datasets tailored to specific mineral types or geological regions to develop and evaluate ML models (see below). By examining these studies, particularly focusing on the characteristics of the datasets employed, a comparative analysis reveals the distinct advantages of the proposed Menoufia University Machine Learning Dataset for Minerals Classification 2025 (MUMDMC2025). This dataset stands out for its comprehensive and diverse collection of rock images, enhanced annotation quality, and balanced representation of mineral classes. These features address key limitations found in many existing datasets, such as restricted accessibility, inconsistent image acquisition parameters, and class imbalance, thereby advancing the field and enabling more robust and generalizable ML-based mineral classification.

The work in15 conducted mineral identification using color spaces and artificial neural networks. They analyzed 22 images of five minerals, capturing images at various angles with polarized light, but the dataset was not accessible. Their neural network achieved a success rate of 81–98% based on unseen samples.

The work in16 aimed at automating rock sample classification using various pattern recognition methods. Their study included 2,700 microscopic images of nine minerals, each 1280 × 960 pixels; however, the dataset is not publicly available.

The work in17 developed an ensemble machine learning model based on the Inception-v3 architecture for rock-mineral microscopic images. They utilized a dataset of 481 images for training and evaluation, but did not disclose the dataset’s availability or other characteristics.

The work in18 addressed automated mineral classification using KNN and DT models, analyzing images captured at multiple angles with polarized light. Although their approach achieved over 90% accuracy, the dataset specifics were not provided.

The work in19 employed a concatenated convolutional neural network for classifying thin section images from 92 rock samples, resulting in an average accuracy of 89.97%. The study processed 2,208 images sliced into smaller patches, but again, the dataset was not publicly available.

The work in20 utilized convolutional neural networks to classify six types of igneous rocks from petrographic thin section images. They employed ResNet152 and VGG19BN models, processing 352 original images that were augmented through flipping and rotating. While the dataset was not publicly available, images were taken under specific polarized light conditions.

The work in21 explored deep learning for intelligent lithology identification, using a dataset of 14,950 rock microscopic images from various rock types. This study highlighted the superior performance of the Xception model, but the dataset was not available to other researchers.

A common limitation among the reviewed studies is the restricted accessibility of their datasets. This hinders reproducibility and independent verification of the previous methods. Additionally, many studies lack detailed information about image acquisition parameters, such as magnification, resolution, and polarization conditions. This inconsistency makes it challenging to compare results across different studies and to assess the generalizability of the previous techniques. Furthermore, most studies focus on specific mineral or rock types, limiting the applicability of their findings to a broader range of geological samples.

Limitations of the publicly available datasets

Current mineral image datasets suffer from three fundamental constraints that severely limit their utility for robust machine learning applications. Scale deficiency represents the most critical limitation in current datasets, exemplified by the Igneous and Metamorphic Dataset22, which contains merely 92 accessible images (from 200 originally reported) with class distributions of ≤34 samples per mineral category. This sample size falls far below the statistical requirements for reliable machine learning model training and validation, particularly for complex classification tasks involving subtle optical property distinctions (e.g., pleochroism in biotite or 0.009 birefringence in quartz)23,24. Small sample sizes exacerbate overfitting risks and fail to capture geological variability, as demonstrated in hyperspectral23 and thin-section analyses25.

Optical incompleteness constitutes the second major limitation, as most existing collections capture ≤ 5 rotation angles per specimen. The GEO Dataset exemplifies this constraint, omitting the comprehensive interference patterns observable across complete crystallographic orientations that are essential for accurate mineral identification. This limited angular sampling fails to document critical optical phenomena, including extinction angle variations and complete pleochroism sequences that define mineral species18.

Metadata deficiencies represent the third critical constraint, with systematic reviews revealing that a substantial proportion of existing mineral image datasets lack essential acquisition parameters (e.g., magnification settings, polarization modes, and imaging conditions)26,27. This documentation gap severely hampers reproducibility, as seen in inconsistent birefringence measurements28 and prevents meaningful comparison between datasets or validation of methodological approaches29,30. The cumulative effect of these limitations manifests in poor machine learning performance, with existing datasets achieving classification accuracies ≤ 52% (see below).

This study introduces the MUMDMC2025 dataset, a comprehensive collection of mineral and rock images with detailed metadata, including mineral type, acquisition parameters, and labelling annotations. The full collection is maintained by our research group, and a curated subset is released publicly to support collaboration and reproducibility, paving the way for enhanced mineral and rock classification methodologies.

MUMDMC2025 Contributions

Importance of the Selected Minerals Identification and Rock Classification

The automation of mineral identification for Biotite, Hornblende, Plagioclase, Potassium-Feldspar, and Quartz is critical due to their ubiquity in igneous and metamorphic rocks, economic significance in mining (e.g., Quartz in silicon production, Feldspars in ceramics), and role in geological process interpretation31,32. Traditional methods like point counting—a manual technique where minerals are quantified via thin-section analysis under polarized light—remain foundational but face limitations in scalability and subjectivity, as noted in studies comparing manual counts with automated mineralogy (e.g., X-ray diffraction or SEM-based systems)32,[33](https://www.nature.com/articles/s41597-025-05879-9#ref-CR33 “About Automated Mineralogy. https://www.portaspecs.com/about-automated-mineralogy/

(Accessed: 16th June 2025).“). For instance, point counting’s labor-intensive nature and inter-operator variability are well-documented in rock texture analysis34,35, while recent advancements in Laser-Induced Breakdown Spectroscopy (LIBS) mapping36 and machine learning (e.g., Decision Trees (DT) and K-Nearest Neighbors (KNN) for classification of thin sections35,[37](https://www.nature.com/articles/s41597-025-05879-9#ref-CR37 “K-Nearest Neighbors (KNN) Classification with scikit-learn | DataCamp. https://www.datacamp.com/tutorial/k-nearest-neighbor-classification-scikit-learn

(Accessed: 16th June 2025).“),38) demonstrate superior efficiency and reproducibility. Thus, automating the identification of these five minerals aligns with industry demands for rapid, accurate resource assessment and reduced human bias, as underscored by applications in sustainable mining and exploration31,39.

Properties of the selected minerals

The developed dataset comprises five types of minerals, which represent the rock-forming minerals of various rock types. Each type has a set of distinct properties, listed as follows:

Biotite K(Mg, Fe)₃(AlSi₃O₁₀)(OH, F)₂: Strong pleochroism (yellow to dark brown), high birefringence (0.04–0.08), and perfect basal cleavage. Under crossed polarizers, it exhibits vivid interference colors (2nd–3rd order), commonly found in granites and metamorphic schists[40](#ref-CR40 “Home - Handbook of Mineralogy. https://handbookofmineralogy.org/

(Accessed: 17th June 2025).“),41,[42](https://www.nature.com/articles/s41597-025-05879-9#ref-CR42 “Optical Properties of Minerals » Geology Science. https://geologyscience.com/geology/optical-properties-of-minerals/

(Accessed: 17th June 2025).“).

Hornblende Ca₂(Mg, Fe²⁺, Al)₅(Si, Al)₈O₂₂(OH)₂: Green to brown pleochroism, moderate birefringence (0.014–0.018), and inclined extinction (15°–25°). Displays amphibole cleavage (60°/120°). Key in amphibolites and andesites[40](#ref-CR40 “Home - Handbook of Mineralogy. https://handbookofmineralogy.org/

(Accessed: 17th June 2025).“),41,[42](#ref-CR42 “Optical Properties of Minerals » Geology Science. https://geologyscience.com/geology/optical-properties-of-minerals/

(Accessed: 17th June 2025).“),43.

Plagioclase ((Na, Ca)AlSi₃O₈): A feldspar group mineral, essential in identifying rock types such as granites and basalts. It shows albite twinning (parallel striations) and low birefringence (0.008–0.013), and ranges from colorless (albite) to gray (anorthite). Zoning patterns are common in igneous and metamorphic rocks[40](#ref-CR40 “Home - Handbook of Mineralogy. https://handbookofmineralogy.org/

(Accessed: 17th June 2025).“),44.

Potassium-Feldspar (Alkali feldspar; KAlSi₃O₈): A feldspar group mineral, low birefringence (0.007–0.01), often displays Carlsbad twinning, and appears colorless to pale pink in plane-polarized light. Perthitic textures (exsolution lamellae) are diagnostic under high magnification, commonly dominant in granites and pegmatites[40](#ref-CR40 “Home - Handbook of Mineralogy. https://handbookofmineralogy.org/

(Accessed: 17th June 2025).“),44.

Quartz (SiO₂): Uniaxial positive, low birefringence (0.009), and lacks cleavage. Appears colorless with undulatory extinction in strained crystals. Ubiquitous in granites, sandstones, and hydrothermal veins[8](https://www.nature.com/articles/s41597-025-05879-9#ref-CR8 “Quartz. https://www.science.smith.edu/geosciences/petrology/petrography/quartz/quartz.html

(Accessed: 25th June 2025).“),[40](#ref-CR40 “Home - Handbook of Mineralogy. https://handbookofmineralogy.org/

(Accessed: 17th June 2025).“),41,[42](#ref-CR42 “Optical Properties of Minerals » Geology Science. https://geologyscience.com/geology/optical-properties-of-minerals/

(Accessed: 17th June 2025).“),43.

Sample collection methodology

The MUMDMC2025 dataset addresses the limitations described in the Datasets Limitations section through systematic sample collection from the Eastern Desert of Egypt (Wadi Fatira El-beida), a Precambrian basement terrain renowned for its exceptional mineralogical diversity within granite and granodiorite formations. These plutonic rocks exhibit varied mineral assemblages and textural characteristics representative of diverse geological environments, providing ideal specimens for comprehensive optical property documentation.

Thin-section preparation followed rigorous standardized protocols, with samples cut to 30 μm thickness (±2 μm tolerance) using precision diamond-wafering techniques. Sequential polishing procedures culminated in final treatment with 0.3 μm alumina slurry to achieve optical-grade surface quality essential for high-resolution imaging applications45.

360° rotational imaging approach

The dataset’s distinguishing feature involves comprehensive rotational imaging at 5° increments across complete 360° rotations, capturing the full spectrum of anisotropic optical properties. This approach documents pleochroism variations (hornblende displaying characteristic green to brown color transitions), extinction angle progressions (plagioclase exhibiting 0°–20° extinction ranges), and complete birefringence sequences (biotite showing 0.04–0.08 birefringence variations). The resulting 14,400 images (72 rotations × 5 mineral species × 2 polarization modes × 20 individual crystals) provide unprecedented comprehensive optical characterization.
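The image count quoted above follows directly from the acquisition factors. The short check below is only an illustration of that arithmetic; the variable names are ours and are not part of the dataset or its tooling.

```python
# Factors stated in the text; the names are illustrative only.
ROTATION_STEP_DEG = 5
FULL_ROTATION_DEG = 360
MINERAL_CLASSES = ["Biotite", "Hornblende", "Plagioclase", "Potassium-Feldspar", "Quartz"]
POLARIZATION_MODES = ["PPL", "XPL"]
CRYSTALS_PER_MINERAL = 20

# Stage angles 0, 5, ..., 355 give one image per angle and polarization mode.
stage_angles = list(range(0, FULL_ROTATION_DEG, ROTATION_STEP_DEG))
images_per_crystal = len(stage_angles) * len(POLARIZATION_MODES)
images_per_class = images_per_crystal * CRYSTALS_PER_MINERAL
total_images = images_per_class * len(MINERAL_CLASSES)

print(len(stage_angles), images_per_class, total_images)  # 72 2880 14400
```

The per-class figure of 2,880 is the same value reported for each mineral class elsewhere in the paper.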

Subset selection rationale

To balance statistical significance with the practical storage limitations of open repositories, we carefully curated a publicly available subset of the dataset. This subset comprises 2,500 cross-polarized light images, with 500 images allocated per mineral class. This balanced selection ensures the dataset remains manageable for broad accessibility while still providing a robust sample for research. The full sample of this dataset can be accessed at figshare.com[46](https://www.nature.com/articles/s41597-025-05879-9#ref-CR46 “Amer, B. G. MUMDMC2025_DataSet_sample. figshare, https://doi.org/10.6084/m9.figshare.29483204.v1

. https://figshare.com/articles/dataset/MUMDMC2025_DataSet_sample/29483204?file=55998200

(2025).“).

Machine Learning in Mineralogy

Machine learning has demonstrated practical efficacy in mineral classification, as evidenced by recent studies employing DT and KNN. For instance, a DT was used to classify 10 minerals from SEM/EDS data, achieving robust accuracy by leveraging elemental composition as decision attributes47. Similarly, KNN has been successfully applied to thin-section analysis, reporting high accuracy (>90%) in pore-type identification in carbonate rocks when combined with SVM and fuzzy fusion48. KNN has also proved useful in geochemical discrimination and in multi-label mineral image classification (>85% mean average precision)48,[49](https://www.nature.com/articles/s41597-025-05879-9#ref-CR49 “Autonomous Mineral Classification Enhances Planetary Exploration. https://www.spectroscopyonline.com/view/autonomous-mineral-classification-enhances-planetary-exploration

(Accessed: 17th June 2025).“). These examples underscore ML’s adaptability to diverse datasets—from spectral (LIBS, Raman) to optical (thin sections)—validating its emergence as a transformative tool for mineralogy22,[49](https://www.nature.com/articles/s41597-025-05879-9#ref-CR49 “Autonomous Mineral Classification Enhances Planetary Exploration. https://www.spectroscopyonline.com/view/autonomous-mineral-classification-enhances-planetary-exploration

(Accessed: 17th June 2025).“).

Historically, the use of ML in mineral classification was limited by the lack of high-quality labelled datasets, which are essential for training supervised learning models22. As more such datasets become available, there is growing potential for improving classification outcomes, even in cases where minerals exhibit similar optical properties50. By leveraging the strengths of ML models, geologists can improve the speed and accuracy of mineral identification, reducing the need for labor-intensive manual methods18.

The MUMDMC2025 dataset comprises granite and granodiorite samples from the Eastern Desert, Egypt (Wadi Fatira El-beida), a well-documented Precambrian basement terrain known for its mineralogical diversity. Samples were selected to represent varied textures (e.g., porphyritic K-feldspars, myrmekitic plagioclase-quartz intergrowths) and alteration states (e.g., chloritized biotite, saussuritized plagioclase), ensuring coverage of both pristine and weathered phases common in igneous systems. This aligns with established petrographic standards for granite classification and addresses texture variability critical for ML robustness, as emphasized in recent studies on automated mineralogy. The inclusion of these rock types—granite (silicic) and granodiorite (intermediate)—provides a compositional spectrum that enhances model generalizability, as demonstrated in similar ML works targeting plutonic rocks51,[52](https://www.nature.com/articles/s41597-025-05879-9#ref-CR52 “BRITROCKS: mineralogy and petrology collections database - British Geological Survey. https://www.bgs.ac.uk/technologies/databases/bgs-rock-collections/

(Accessed: 17th June 2025).“).

Machine learning algorithms

The Decision Tree and K-Nearest Neighbors machine learning models, which are two of the simplest ML models, are widely recognized for their interpretability and minimal training complexity53,54. DTs use hierarchical rule-based splitting54, while KNN relies on instance proximity without parametric assumptions. Both serve as introductory algorithms in ML due to their conceptual transparency54,[55](https://www.nature.com/articles/s41597-025-05879-9#ref-CR55 “What is the k-nearest neighbors algorithm? | IBM. https://www.ibm.com/think/topics/knn

(Accessed: 25th June 2025).“).

The MUMDMC2025 dataset is evaluated with two established machine learning models for mineral classification. The Decision Tree (DT) algorithm, a supervised learning method, recursively partitions the feature space by optimizing splits at each node using impurity measures, including the Gini impurity (a metric ranging from 0 to 1 that quantifies the probability of misclassifying a randomly chosen element)56. This splitting continues until reaching terminal nodes containing homogeneous class distributions. The second model, K-Nearest Neighbors (KNN), classifies samples by comparing feature-space distances (typically Euclidean) to k surrounding labelled instances57,[58](#ref-CR58 “Nearest neighbor pattern classification | IEEE Journals & Magazine | IEEE Xplore. https://ieeexplore.ieee.org/document/1053964

(Accessed: 17th June 2025).“),59. While DT offers interpretability through its tree structure and impurity-based decision rules56, KNN provides flexibility in handling complex decision boundaries without parametric assumptions57,[58](#ref-CR58 “Nearest neighbor pattern classification | IEEE Journals & Magazine | IEEE Xplore. https://ieeexplore.ieee.org/document/1053964

(Accessed: 17th June 2025).“),59.
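To make the two ideas above concrete, the following minimal sketch computes the Gini impurity of a set of labels and performs a Euclidean-distance KNN vote. It uses only the Python standard library; the two-dimensional feature vectors are hypothetical toy values, not features extracted from MUMDMC2025, and this is not the code used for the reported benchmarks.

```python
import math
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training points nearest to `query` (Euclidean)."""
    nearest = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features for six crystals of two classes.
X = [(0.10, 0.90), (0.15, 0.85), (0.20, 0.80), (0.80, 0.20), (0.85, 0.25), (0.90, 0.10)]
y = ["Quartz", "Quartz", "Quartz", "Biotite", "Biotite", "Biotite"]

print(gini_impurity(y))                  # 0.5: two equally frequent classes
print(gini_impurity(y[:3]))              # 0.0: a pure (homogeneous) node
print(knn_predict(X, y, (0.12, 0.88)))   # Quartz
```

For K equally frequent classes the impurity peaks at 1 − 1/K, so a node holding the five mineral classes in equal proportion has a Gini impurity of 0.8; a decision tree chooses splits that reduce this value in the child nodes.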

Validation overview

Initial benchmarking demonstrates the dataset’s superior performance compared to existing collections. K-Nearest Neighbors classification achieved 87.6% accuracy with an F1-score of 0.876, while Decision Tree algorithms reached 71.1% accuracy with an F1-score of 0.711. Per-mineral discrimination analysis revealed exceptional performance, with quartz achieving near-perfect identification (AUC = 0.98) attributable to its distinctive uniaxial interference patterns.

Importantly, observed misclassifications align with established geological knowledge, including systematic hornblende-plagioclase confusion at specific extinction angles, validating the dataset’s geological authenticity. These results demonstrate the dataset’s capacity to reduce identification time from hours to seconds while maintaining rigorous petrographic standards, establishing its utility for both research applications and industrial mineral assessment protocols.
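For readers who wish to reproduce metrics of this kind, the sketch below shows how accuracy and a macro-averaged F1-score can be computed from true and predicted labels. The text does not state which F1 averaging was used, so macro averaging is an assumption here, and the label lists are invented toy data rather than results from the dataset.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))
    scores = []
    for c in classes:
        tp = pairs[(c, c)]
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example: one hornblende crystal is mistaken for plagioclase.
y_true = ["Quartz", "Quartz", "Hornblende", "Hornblende", "Plagioclase", "Plagioclase"]
y_pred = ["Quartz", "Quartz", "Plagioclase", "Hornblende", "Plagioclase", "Plagioclase"]

print(round(accuracy(y_true, y_pred), 3), round(macro_f1(y_true, y_pred), 3))
```

When every class has the same number of samples, as in a balanced dataset, the macro-averaged and support-weighted F1-scores coincide.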

Most importantly, the advancement of automated mineral identification in petrographic analysis has been significantly constrained by the scarcity of comprehensive, publicly accessible datasets that adequately represent the optical complexity of minerals in thin-section microscopy. Existing public datasets exhibit fundamental limitations in scope, class diversity, and optical characterization, while proprietary datasets developed by research institutions and industry remain largely inaccessible to the broader scientific community. This data scarcity creates a critical bottleneck in developing robust machine learning frameworks capable of accurate mineral classification across diverse geological contexts and imaging conditions.

To address these fundamental limitations, we present the Menoufia University Machine Learning Dataset for Minerals Classification 2025 (MUMDMC2025), a meticulously curated collection designed to establish new standards for petrographic dataset development. The dataset comprises 14,400 high-resolution labelled photomicrographs representing five economically and geologically significant mineral classes: Biotite, Hornblende, Plagioclase, Potassium-Feldspar, and Quartz. Each mineral class contains exactly 2,880 images, eliminating the class imbalance concerns that frequently compromise machine learning model performance in geological applications.

The dataset’s primary innovation lies in its comprehensive optical documentation protocol, systematically capturing images under both Cross-Polarized Light (XPL) and Plane-Polarized Light (PPL) conditions across complete 360° rotational sequences at precise 5° increments. This methodological approach ensures complete characterization of anisotropic optical properties, including pleochroism, birefringence variations, and extinction patterns that are fundamental to reliable mineral identification but inadequately represented in existing datasets.

Comprehensive metadata accompanies each image, documenting mineral classification, acquisition parameters, optical conditions, and rotational orientation, thereby supporting both supervised learning applications and detailed optical mineralogy research. Initial validation using Decision Tree (DT) and K-Nearest Neighbors (KNN) algorithms demonstrates the dataset’s effectiveness, with KNN achieving 87.6% classification accuracy compared to 74.7% for DT, establishing baseline performance metrics for future comparative studies.
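As an illustration of what a per-image metadata record of this kind might contain, the sketch below defines a small validated record type. The class name and field names are hypothetical and do not describe the actual MUMDMC2025 metadata schema; the default values are the acquisition settings reported in the Methods section.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ImageRecord:
    """One metadata row per photomicrograph (hypothetical field names)."""
    mineral: str
    polarization: str          # "PPL" or "XPL"
    rotation_deg: int          # stage angle, a multiple of 5 in [0, 355]
    objective: str = "5x/0.12"
    exposure_s: float = 1 / 8
    width_px: int = 3584
    height_px: int = 2746

    def __post_init__(self):
        # Reject values that fall outside the acquisition protocol.
        if self.polarization not in ("PPL", "XPL"):
            raise ValueError(f"unknown polarization mode: {self.polarization}")
        if not 0 <= self.rotation_deg < 360 or self.rotation_deg % 5:
            raise ValueError(f"rotation must be a multiple of 5 in [0, 355]: {self.rotation_deg}")

record = ImageRecord(mineral="Quartz", polarization="XPL", rotation_deg=45)
print(asdict(record))
```

Validating the rotation angle and polarization mode at construction time is one simple way to keep labels consistent with the 5° rotational protocol.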

MUMDMC2025 addresses critical gaps in current petrographic datasets by providing balanced representation, comprehensive optical documentation, and standardized acquisition protocols that facilitate reproducible machine learning research. The dataset serves multiple research applications, including automated petrographic analysis system development, quantitative mineralogy advancement, and educational applications in optical mineralogy. By adhering to FAIR (Findable, Accessible, Interoperable, and Reusable) data principles and maintaining open accessibility28,[60](https://www.nature.com/articles/s41597-025-05879-9#ref-CR60 “FAIR Principles | NNLM. https://www.nnlm.gov/guides/data-thesaurus/fair-principles

(Accessed: 25th June 2025).“), MUMDMC2025 enables collaborative advancement of computer vision applications in geological sciences and supports the development of next-generation automated mineral identification systems.

Methods

Sample collection and preparation

Rock samples were systematically collected from the Eastern Desert of Egypt, specifically from Wadi Fatira El-beida. This region represents a Precambrian basement complex characterized by extensive granite and granodiorite intrusions that exhibit exceptional mineralogical diversity61,62. The selected formations are of particular economic significance, containing valuable industrial minerals including high-purity quartz suitable for silicon production and alkali feldspars essential for ceramic manufacturing. These plutonic rocks display a remarkable range of textural features, from coarse-grained porphyritic textures with euhedral K-feldspar phenocrysts to complex myrmekitic intergrowths between plagioclase and quartz, alongside various degrees of hydrothermal alteration, including chloritization of primary biotite61,62.

The preparation of petrographic thin sections followed rigorous standardized protocols to ensure optimal optical quality and consistency across all samples45. Initial sample cutting was performed using a precision diamond-wafering blade, with each section cut to a standard thickness of 30 μm. This thickness specification was critical for achieving proper optical interference colors and accurate mineral identification under polarized light microscopy. The cutting process was followed by sequential grinding using progressively finer silicon carbide abrasives, beginning with 120-mesh grit and advancing through 220, 400, 600, and 1200-mesh stages. Final polishing was accomplished using a 0.3 μm alumina slurry to achieve the optical clarity necessary for high-resolution imaging45.

Quality control measures were implemented throughout the preparation process to maintain consistency and eliminate substandard sections. Thickness uniformity was verified using standard interference color charts, with sections deviating more than ± 2 μm from the target thickness being discarded and re-prepared. All thin sections were mounted on standard glass slides using epoxy resin and subsequently stored in desiccated containers at 25 °C to prevent moisture-induced artifacts that could compromise optical properties during extended storage periods45.
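The link between section thickness and interference color can be made explicit through the standard retardation relation, retardation = thickness × birefringence, which underlies interference color charts. The sketch below applies it to quartz using the birefringence quoted in this paper; the function name is ours, and the result is the maximum retardation, reached only for grains in the most favorable orientation.

```python
def retardation_nm(thickness_um, birefringence):
    """Optical retardation in nanometres: section thickness times birefringence."""
    return thickness_um * 1000.0 * birefringence

QUARTZ_BIREFRINGENCE = 0.009  # value quoted in the text

# Target thickness (30 um) and the two ends of the +/-2 um tolerance.
for thickness_um in (28.0, 30.0, 32.0):
    gamma = retardation_nm(thickness_um, QUARTZ_BIREFRINGENCE)
    print(f"{thickness_um:.0f} um -> {gamma:.0f} nm")  # about 252, 270 and 288 nm
```

A 2 μm deviation therefore shifts the maximum retardation of quartz by roughly 18 nm, which is why thickness uniformity matters for consistent interference colors across sections.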

Microscopy and imaging protocol

The imaging system used a Euromex iScope polarizing microscope (Fig. 1) equipped with professional-grade optical components selected for consistent, high-quality mineral identification. The microscope was fitted with a CMEX-10pro digital camera featuring 10.0-megapixel resolution, a USB 2.0 interface, and 24-bit RGB color depth to capture the full range of optical properties exhibited by the mineral phases. A Pli-Pol 5×/0.12 objective lens provided consistent magnification across all samples, while the transmitted-light illumination system incorporated an adjustable halogen source for optimal contrast and color fidelity.

Fig. 1

The polarized microscope used (Euromex) to capture the photomicrographs in the current study.

Critical imaging parameters were standardized to ensure reproducibility and comparability across the entire dataset. All images were captured at 600 DPI resolution, producing files with dimensions of 3,584 × 2,746 pixels in RGB24 color space. Exposure time was fixed at 1/8 second to minimize noise while maintaining adequate signal intensity across the range of mineral birefringence values encountered, as shown in Table 1.

Comprehensive calibration procedures were implemented prior to each imaging session to eliminate systematic errors and ensure accurate color representation. Dark field correction was performed by completely covering the objective lens and applying a correction factor of 99 using the Euromex ImageFocusAlpha software (version 1.3.7.15674, built on Oct 8, 2019) to eliminate residual sensor noise. Flat field correction utilized a certified reference slide with uniform illumination characteristics, with the correction factor similarly set to 99 to compensate for uneven illumination across the field of view. White balance calibration was performed against a high-purity quartz standard under plane-polarized light conditions to ensure accurate color reproduction of mineral optical properties. All imaging was conducted in light-sealed conditions to eliminate ambient light interference that could compromise image quality. Fig. 2 shows the main stages of the image acquisition methodology.
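The principle behind dark field and flat field correction can be sketched with the textbook formula corrected = (raw − dark) / (flat − dark), rescaled by the mean gain. This is a generic illustration on toy 2 × 2 frames and is not a description of the internal algorithm of ImageFocusAlpha; in particular, it does not model the software's correction setting of 99, whose meaning is not defined in the text.

```python
def flat_field_correct(raw, dark, flat):
    """Apply (raw - dark) / (flat - dark), rescaled by the mean of (flat - dark)."""
    gain = [[f - d for f, d in zip(f_row, d_row)] for f_row, d_row in zip(flat, dark)]
    mean_gain = sum(map(sum, gain)) / sum(len(row) for row in gain)
    return [
        [(r - d) * mean_gain / g if g else 0.0 for r, d, g in zip(r_row, d_row, g_row)]
        for r_row, d_row, g_row in zip(raw, dark, gain)
    ]

# Toy frames: the right-hand column receives half the illumination.
dark = [[2.0, 2.0], [2.0, 2.0]]          # lens covered: sensor offset only
flat = [[102.0, 52.0], [102.0, 52.0]]    # uniform reference slide
raw = [[82.0, 42.0], [82.0, 42.0]]       # uniform specimen under uneven lighting

corrected = flat_field_correct(raw, dark, flat)
print(corrected)  # every pixel becomes 60.0 once the illumination gradient is removed
```

Subtracting the dark frame removes the sensor offset, and dividing by the dark-subtracted flat frame removes the illumination gradient, so a uniform specimen yields a uniform corrected image.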

Fig. 2

Main stages of the Image Acquisition Methodology.

Fig. 3

Screenshot of the ImageFocusAlpha software interface during the capture of a Plane-Polarized Light (PPL) image.

Fig. 4

Flowchart illustrating the image capture process for microscopic thin sections.

The Image Acquisition Methodology is listed in the following steps:

Initialize Environment:

Switch off all laboratory lights.

Microscope Setup:

Adjust the microscope to a 5x magnification.

Connect the microscope to a laptop.

Open the “Image Focus” software.

Dark Field Correction:

Cover the microscope lens completely.

Click on the “Dark Field Correction” tab in the software.

Set the correction quantity to 99.

Click the “capture” button to initiate the correction process (keeping the lens covered until complete).

Enable the correction.

Flat Field Correction:

Remove the lens cover.

Click on the “Flat Field Correction” tab in the software.

Set the correction quantity to 99.

Click the “capture” button to initiate the correction process.

Enable the correction.

Calibration:

Specify the desired image format (i.e., RGB24), snap mode, and live mode, as shown in Fig. 3, which presents a screenshot of the ImageFocusAlpha software interface during the capture of a Plane-Polarized Light (PPL) image.

Perform white balance adjustment.

Image Capture Process:

Figure 4 illustrates the image capture process for microscopic thin sections. The flowchart outlines a comprehensive methodology that integrates optical microscopy techniques with digital image processing for developing mineralogical datasets.

The imaging protocol begins with the positioning of thin sections containing target mineral samples at an initi
