---
license: mit
language:
- en
metrics:
- accuracy
base_model:
- google/efficientnet-b0
pipeline_tag: image-classification
---
|
|
<h1 style="color:rebeccapurple; font-weight: bold;">🍽️ Model Card for Food Vision Model</h1>

This image classification model identifies the type of food shown in an image. It was developed as part of a Food Vision project, using transfer learning on a pre-trained convolutional neural network.
|
|
---

# Model Details

### Model Description

This deep learning model classifies food images into one of the 101 categories of the Food101 dataset. It was trained with TensorFlow using transfer learning, which reuses the features learned by a model pre-trained on a large dataset such as ImageNet. Training used mixed precision, which can speed up training and reduce memory usage.
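To make the mixed precision trade-off concrete, the sketch below rounds a few values to half precision (`float16`) using only Python's standard library. It illustrates the number format and is not code from the training notebook; the helper `to_float16` and the scale factor 1024 are chosen here purely for the example.

```python
import struct


def to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (float16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# float16 keeps roughly three significant decimal digits.
tenth = to_float16(0.1)

# A very small gradient-like value underflows to zero in float16 ...
tiny = to_float16(1e-8)

# ... but survives when it is scaled up first (the idea behind loss scaling).
scaled = to_float16(1e-8 * 1024)

print(tenth, tiny, scaled)
```

This limited range is why mixed precision setups typically keep model variables in `float32` and only run the heavy computations in `float16`.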
|
|
---

<h1 style="color:#b4464b; font-weight: bold;">⚛️ HuggingFace Space for Food Vision Model</h1>
<p><a href="https://huggingface.co/spaces/Recompense/FoodVision" style="color:hotpink; font-weight:bold;">Use it here</a></p>

---
|
|
* **Developed by:** `Recompense`
* **Model type:** Image Classification (Transfer Learning with a CNN backbone)
* **Language(s) (NLP):** N/A (Image Classification)
* **License:** MIT
* **Finetuned from model:** EfficientNetB0
|
|
|
|
# Uses

This model is intended for classifying images of food into 101 distinct categories. Potential use cases include:

* Food recognition in mobile applications.
* Organizing food images in databases.
* Assisting in dietary tracking or recipe suggestions based on images.
|
|
---

# Limitations

* **Dataset Bias:** The model is trained on the Food101 dataset. Its performance may degrade on food images that differ significantly in style, presentation, or origin from those in the training data.
* **Image Quality:** Performance can be affected by image quality, lighting conditions, occlusions, and variations in food presentation.
* **Specificity:** Although it classifies into 101 categories, it may not distinguish between very similar dishes or variations within a category.
|
|
---

# Evaluation

The model's performance was evaluated using standard classification metrics on held-out images from the Food101 dataset.

#### Testing Data

The model was evaluated on the validation split of the Food101 dataset.

* **Food101 Dataset:** 101 food categories with 101,000 images in total: 750 training images and 250 test images per class. TensorFlow Datasets exposes the test images as its `validation` split.
* **Source:** [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/food101)
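As a quick sanity check on the dataset figures above, the split sizes follow directly from the per-class counts. The constant names are chosen here for illustration.

```python
NUM_CLASSES = 101
TRAIN_PER_CLASS = 750
TEST_PER_CLASS = 250

# Per-split totals follow from multiplying by the number of classes.
train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = NUM_CLASSES * TEST_PER_CLASS
total_images = train_images + test_images

print(train_images, test_images, total_images)  # 75750 25250 101000
```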
|
|
#### Factors

Evaluation was performed on the validation dataset as a whole. Further analysis could disaggregate performance by food category to identify the classes on which the model performs better or worse.
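One way such a per-category breakdown could be computed is sketched below. The function name `per_class_accuracy` and the toy labels are illustrative assumptions; they are not taken from the project's notebook and are not results from this model.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return {label: accuracy} computed separately for each true label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for true_label, pred_label in zip(y_true, y_pred):
        total[true_label] += 1
        if true_label == pred_label:
            correct[true_label] += 1
    return {label: correct[label] / total[label] for label in total}


# Toy labels for illustration only.
y_true = ["pizza", "pizza", "steak", "steak", "steak", "sushi"]
y_pred = ["pizza", "steak", "steak", "steak", "filet_mignon", "sushi"]

scores = per_class_accuracy(y_true, y_pred)

# Print the weakest classes first.
for label, acc in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{label}: {acc:.2f}")
```

Sorting by accuracy puts the hardest classes at the top, which is usually where look-alike dishes show up.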
|
|
#### Metrics

The primary evaluation metric is accuracy. A confusion matrix was also generated to visualize per-class performance.

* **Accuracy:** The proportion of correctly classified images out of the total number of images evaluated.

$$
\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}
$$

* **Confusion Matrix:** A table that visualizes the performance of a classification model. Each row represents the instances in an actual class, while each column represents the instances in a predicted class.
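Both metrics can be sketched in a few lines of plain Python. The toy labels below cover three classes and are illustrative only; the function names are chosen for this example rather than taken from the project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[i][j] counts samples of actual class i predicted as class j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Toy integer labels for three classes; not results from this model.
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]

acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, num_classes=3)

print(acc)
for row in cm:
    print(row)
```

The diagonal of the matrix holds the correct predictions, so accuracy equals the diagonal sum divided by the total count. `sklearn.metrics.confusion_matrix` uses the same row and column convention.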
|
|
### Results

The model reaches over 80% accuracy on the validation data.

#### Summary

Transfer learning helped the model achieve higher accuracy, but the model struggled to tell apart dishes that look closely alike. Although the dataset is large, more data is still needed to differentiate between such similar-looking foods.
|
|
---

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

* **Hardware Type:** Tesla T4
* **Hours used:** about 1 hour (upper-bound estimate)
* **Cloud Provider:** Google Cloud
* **Compute Region:** us-central
* **Carbon Emitted:** <span style="color:gold;"><b>80 grams of CO₂eq</b></span> (estimated)
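The general shape of such an estimate is energy used multiplied by the carbon intensity of the grid. The sketch below is a simplified illustration, not the calculator's implementation: the function name is hypothetical, and the grid intensity is a placeholder rather than the calculator's value for this region, so the output is not expected to reproduce the 80 g figure above.

```python
def estimate_emissions_g(power_watts, hours, grid_g_per_kwh):
    """Estimate emissions in grams of CO2eq: energy (kWh) x carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * grid_g_per_kwh


# Assumed inputs for illustration only:
# 70 W is the Tesla T4's rated board power; the grid intensity is a placeholder.
T4_POWER_WATTS = 70
TRAINING_HOURS = 1
PLACEHOLDER_GRID_G_PER_KWH = 500

estimate = estimate_emissions_g(
    T4_POWER_WATTS, TRAINING_HOURS, PLACEHOLDER_GRID_G_PER_KWH
)
print(f"{estimate:.1f} g CO2eq")
```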
|
|
---

# Technical Specifications

### Model Architecture and Objective

The model is a fine-tuned convolutional neural network (CNN) classifier built on a modern CNN architecture that is compatible with `float16` data types. Mixed precision training was used to speed up training. The objective is to minimize the classification loss (e.g., categorical cross-entropy) so that the model accurately predicts the food category of a given image.
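For readers unfamiliar with the loss mentioned above, here is a minimal sketch of softmax followed by categorical cross-entropy for a single example. It uses toy scores for three classes (the real model outputs 101), and the function names are chosen for this illustration rather than taken from the training code.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_index])


# Toy logits for a 3-class problem.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# The loss is small when the true class has high probability, large otherwise.
loss_if_right = categorical_cross_entropy(probs, true_index=0)
loss_if_wrong = categorical_cross_entropy(probs, true_index=2)
print(loss_if_right, loss_if_wrong)
```

Minimizing this loss pushes probability mass toward the correct class, which is what drives accuracy up during training.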
|
|
### Compute Infrastructure

The model was trained on a Tesla T4 GPU on Google Cloud in the us-central region. The estimated carbon emissions for 1 hour of training on this setup are 80 grams of CO₂eq. The environment was set up to support mixed precision training.
|
|
### Software

* TensorFlow
* TensorFlow Datasets
* NumPy
* Matplotlib
* Scikit-learn
* Helper functions from `helper_functions.py` (for plotting, data handling)
|
|
---

# Usage

Here's an example of how to use the model for inference on a new image.

First, make sure TensorFlow is installed, along with `huggingface_hub`, which Keras needs in order to load models from `hf://` paths:

```bash
pip install tensorflow huggingface_hub
```

Then, you can load the model and make a prediction:
|
|
```python
import os

# Select the Keras backend before importing keras; setting it afterwards has no effect.
# Available backend options are: "jax", "torch", "tensorflow".
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

loaded_model = keras.saving.load_model("hf://Recompense/FoodVision")

# The 101 Food101 class names, assumed to be in the same order as the labels
# used during training. Replace them if your label order differs.
class_names = [
    'apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare', 'beet_salad',
    'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito', 'bruschetta', 'caesar_salad',
    'cannoli', 'caprese_salad', 'carrot_cake', 'ceviche', 'cheesecake', 'cheese_plate',
    'chicken_curry', 'chicken_quesadilla', 'chicken_wings', 'chocolate_cake', 'chocolate_mousse',
    'churros', 'clam_chowder', 'club_sandwich', 'crab_cakes', 'creme_brulee', 'croque_madame',
    'cup_cakes', 'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict', 'escargots',
    'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras', 'french_fries', 'french_onion_soup',
    'french_toast', 'fried_calamari', 'fried_rice', 'frozen_yogurt', 'garlic_bread', 'gnocchi',
    'greek_salad', 'grilled_cheese_sandwich', 'grilled_salmon', 'guacamole', 'gyoza', 'hamburger',
    'hot_and_sour_soup', 'hot_dog', 'huevos_rancheros', 'hummus', 'ice_cream', 'lasagna',
    'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup',
    'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters', 'pad_thai', 'paella', 'pancakes',
    'panna_cotta', 'peking_duck', 'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib',
    'pulled_pork_sandwich', 'ramen', 'ravioli', 'red_velvet_cake', 'risotto', 'samosa', 'sashimi',
    'scallops', 'seaweed_salad', 'shrimp_and_grits', 'spaghetti_bolognese', 'spaghetti_carbonara',
    'spring_rolls', 'steak', 'strawberry_shortcake', 'sushi', 'tacos', 'takoyaki', 'tiramisu',
    'tuna_tartare', 'waffles',
]


# Create a function to load and prepare images
def load_prep_image(filepath, img_shape=224, scale=True):
    """
    Reads in an image and preprocesses it for model prediction

    Args:
        filepath (str): path to target image
        img_shape (int): shape to resize image to. Default = 224
        scale (bool): Condition to scale image. Default = True

    Returns:
        Image Tensor of shape (img_shape, img_shape, 3)
    """
    image = tf.io.read_file(filepath)
    # expand_animations=False guarantees a 3-D tensor, even for GIF input
    image_tensor = tf.io.decode_image(image, channels=3, expand_animations=False)
    image_tensor = tf.image.resize(image_tensor, [img_shape, img_shape])
    if scale:
        # Scale image tensor to be between 0 and 1
        return image_tensor / 255.
    return image_tensor


# Load and preprocess a sample image
# Replace 'path/to/your/image.jpg' with the actual path to your image
sample_image_path = 'path/to/your/image.jpg'
# Keras EfficientNet backbones rescale their inputs internally and expect pixel
# values in [0, 255]; if this model was built that way, pass scale=False here.
prepared_image = load_prep_image(sample_image_path)

# Add a batch dimension to the image
prepared_image = tf.expand_dims(prepared_image, axis=0)

# Make a prediction
predictions = loaded_model.predict(prepared_image)

# Get the predicted class index for the single image in the batch
predicted_class_index = int(np.argmax(predictions[0]))

# Get the predicted class name
predicted_class_name = class_names[predicted_class_index]

# Print the prediction
print(f"The predicted food item is: {predicted_class_name}")

# Optional: Display the image
# img = plt.imread(sample_image_path)
# plt.imshow(img)
# plt.title(f"Prediction: {predicted_class_name}")
# plt.axis('off')
# plt.show()
```
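Because the model can confuse similar dishes, it may help to look at the few most likely classes rather than only the top one. The helper below is a hypothetical addition (`top_k` is not part of the original project) and is shown with toy probabilities so it runs on its own.

```python
def top_k(probabilities, class_names, k=5):
    """Return the k (class_name, probability) pairs with the highest probability."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Toy probabilities standing in for one row of the model's output.
demo_names = ["apple_pie", "baklava", "pizza", "sushi"]
demo_probs = [0.05, 0.15, 0.70, 0.10]

top = top_k(demo_probs, demo_names, k=2)
for name, prob in top:
    print(f"{name}: {prob:.2f}")
```

With the usage example above, the equivalent call would be `top_k(predictions[0].tolist(), class_names)`.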