Run YOLOv12 on Windows: Step-by-Step Installation Guide

YOLOv12 is an object detection model that integrates attention mechanisms to improve detection accuracy while keeping the computational efficiency of earlier YOLO versions.
This guide walks through configuring, training, and deploying YOLOv12 on a Windows system, covering installation, dataset preparation, model training, and inference.
Step 1: Environment Configuration
Before starting the setup, make sure the following software is installed:
Essential Software and Tools:
- Python: Download and install Python (version 3.8 or later) from the official website, ensuring compatibility with YOLOv12 dependencies.
- Git: Install Git to facilitate repository cloning and version control.
- Integrated Development Environment (IDE): Utilize Visual Studio Code (VS Code) or an alternative IDE for script editing and execution.
- NVIDIA GPU Drivers: If you plan to use GPU acceleration, install the latest NVIDIA drivers.
- CUDA and cuDNN: Required for GPU-based model training and inference.
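Before moving on, it can help to confirm that the prerequisites above are visible from Python. The following is a minimal sketch, not part of the YOLOv12 repository: the function name check_environment is hypothetical, and the CUDA check assumes PyTorch may or may not be installed yet.

```python
import shutil
import sys


def check_environment(min_python=(3, 8)):
    """Return a dict describing which prerequisites were found."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "git_found": shutil.which("git") is not None,
        "cuda_available": None,  # None means PyTorch could not be imported
    }
    try:
        import torch  # optional at this stage; normally installed with the requirements
        report["cuda_available"] = torch.cuda.is_available()
    except (ImportError, OSError):
        # PyTorch is missing or its native libraries failed to load
        pass
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If cuda_available prints False after PyTorch is installed, the usual suspects are the GPU driver or a CPU-only PyTorch build.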
Dependency Installation
Cloning the YOLOv12 Repository: Run the following commands in a terminal or Command Prompt:
git clone https://github.com/sunsmarterjie/yolov12.git
cd yolov12
Installing Python Dependencies: From inside the cloned yolov12 directory, run:
pip install -r requirements.txt
pip install -e .
If errors occur, make sure setuptools and wheel are up to date:
pip install --upgrade setuptools wheel
Additional Dependencies: If you use external dataset management tools:
pip install roboflow supervision
Step 2: Dataset Preparation
YOLOv12 employs a dataset format akin to YOLOv8. The following steps ensure proper data structuring:
Dataset Formatting:
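- YOLOv8 PyTorch TXT Format: Each image must have a corresponding .txt file with bounding box annotations, one line per object: the class index followed by the box center x, center y, width, and height, all normalized to the range 0 to 1.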
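To make the label format concrete, here is a small sketch that converts one annotation line into pixel coordinates. The function name and the sample line are illustrative assumptions. The field order (class, center x, center y, width, height, all normalized) is the standard YOLO TXT layout.

```python
def parse_yolo_label_line(line, image_width, image_height):
    """Convert one 'class xc yc w h' label line to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:])
    # Normalized center/size -> absolute corner coordinates
    x1 = (xc - w / 2) * image_width
    y1 = (yc - h / 2) * image_height
    x2 = (xc + w / 2) * image_width
    y2 = (yc + h / 2) * image_height
    return class_id, x1, y1, x2, y2


print(parse_yolo_label_line("0 0.5 0.5 0.25 0.5", 640, 480))
# -> (0, 240.0, 120.0, 400.0, 360.0)
```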
Utilizing Roboflow for Dataset Management:
- Creating a Roboflow Account: Register for a Roboflow account to facilitate dataset organization and download.
- Dataset Conversion (if required): Convert non-YOLO formats (e.g., COCO JSON) using Roboflow utilities.
- Creating data.yaml Configuration:
train: path/to/train/images
val: path/to/validation/images
test: path/to/test/images
nc: number_of_classes
names: ['class1', 'class2', ...]
- Dataset Retrieval via API:
from roboflow import Roboflow
ROBOFLOW_API_KEY = "YOUR_API_KEY_HERE"
rf = Roboflow(api_key=ROBOFLOW_API_KEY)
project = rf.workspace("your-workspace").project("your-project")
version = project.version("your-version")
dataset = version.download("yolov8")
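A Roboflow export normally ships its own data.yaml. When the file is written by hand instead, a common mistake is an nc value that does not match the names list. The sketch below generates the file text so that nc is always derived from names. The helper name and the example paths are hypothetical, and class names containing quote characters are not handled.

```python
def build_data_yaml(train, val, test, names):
    """Return data.yaml text; nc is derived from names so the two always agree."""
    if not names:
        raise ValueError("at least one class name is required")
    quoted = ", ".join(f"'{name}'" for name in names)
    return (
        f"train: {train}\n"
        f"val: {val}\n"
        f"test: {test}\n"
        f"nc: {len(names)}\n"
        f"names: [{quoted}]\n"
    )


text = build_data_yaml("train/images", "valid/images", "test/images", ["car", "pedestrian"])
print(text)
```

Writing the returned text to data.yaml with open("data.yaml", "w") gives a file in the layout shown above.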
Step 3: Model Training
Executing Training Commands:
Utilize the following script, specifying the model configuration and dataset path:
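from ultralytics import YOLO
# On Windows, keep training under the main guard: data-loading worker processes re-import this script
if __name__ == '__main__':
    model = YOLO('yolov12s.yaml')  # build the small model from its config file
    results = model.train(
        data='path/to/your/data.yaml',
        epochs=250,
    )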
Real-World Implementation: Traffic Object Detection
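from ultralytics import YOLO
# On Windows, keep training under the main guard: data-loading worker processes re-import this script
if __name__ == '__main__':
    model = YOLO('yolov12m.yaml')  # medium model config
    results = model.train(
        data='traffic_data.yaml',  # dataset config for the traffic dataset
        epochs=300,
        batch=16,   # images per batch
        imgsz=640,  # training image size in pixels
    )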
This example trains the medium YOLOv12 model on a traffic dataset to detect traffic-related objects such as vehicles, pedestrians, and signals.
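After training, the weights need to be located for the inference step. The sketch below assumes the default Ultralytics output layout, where each run writes weights/best.pt under a runs directory. Both the layout and the helper name are assumptions, so adjust them if a custom project or name was passed to train.

```python
from pathlib import Path


def find_latest_weights(runs_dir="runs", filename="best.pt"):
    """Return the most recently modified weights file under runs_dir, or None."""
    candidates = list(Path(runs_dir).glob(f"**/weights/{filename}"))
    if not candidates:
        return None
    # Pick the file written last, i.e. the newest training run
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(find_latest_weights())
```

The returned path can be passed directly to YOLO(...) in place of a hard-coded model path.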
Step 4: Inference Execution
Single-Image Inference:
from ultralytics import YOLO
model = YOLO('path/to/trained/model.pt')
results = model('path/to/image.jpg')  # returns a list with one Results object per image
results[0].show()  # display the annotated image
Real-Time Video Inference:
import cv2
from ultralytics import YOLO
model = YOLO('yolov12s.pt')
cap = cv2.VideoCapture(0)  # 0 selects the default webcam
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    results = model(frame)
    # plot() returns the frame with boxes and labels drawn on it (BGR array)
    cv2.imshow('YOLOv12 Detection', results[0].plot())
    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
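For real-time use it is often worth checking how many frames per second the loop actually reaches. The following sketch keeps a rolling estimate over recent frames. The class name FpsMeter and the window size are illustrative choices, not part of YOLOv12 or OpenCV.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.timestamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one processed frame and return the current FPS estimate."""
        self.timestamps.append(time.perf_counter() if now is None else now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / elapsed if elapsed > 0 else 0.0
```

Calling fps = meter.tick() once per loop iteration, right after the model call, gives a value that can be drawn on the frame with cv2.putText.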
Step 5: Model Deployment
Trained YOLOv12 models can be integrated into various deployment frameworks. Below is an API deployment example using Flask:
Deploying YOLOv12 via Flask API:
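from flask import Flask, request, jsonify
from ultralytics import YOLO
from PIL import Image
import io
app = Flask(__name__)
model = YOLO('yolov12s.pt')  # load the model once at startup
@app.route('/detect', methods=['POST'])
def detect():
    # Read the uploaded file from the multipart form field named 'image'
    image = Image.open(io.BytesIO(request.files['image'].read()))
    result = model(image)[0]  # one Results object per input image
    boxes = result.boxes
    # Convert tensors to plain Python types so they can be serialized as JSON
    detections = [
        {'box': xyxy, 'confidence': conf, 'class': result.names[int(cls)]}
        for xyxy, conf, cls in zip(boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist())
    ]
    return jsonify(detections)
if __name__ == '__main__':
    app.run(debug=True)  # debug mode is intended for local testing only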
This API facilitates object detection by accepting image uploads and returning results in JSON format.
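To try the endpoint, a client has to send the image as a multipart form upload in a field named image. The sketch below does this with the standard library only. The helper names are hypothetical, and the URL assumes Flask's default development address of 127.0.0.1 on port 5000.

```python
import json
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, content_type="image/jpeg"):
    """Encode one file as multipart/form-data; returns (body_bytes, content_type_header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def detect(image_path, url="http://127.0.0.1:5000/detect"):
    """POST an image to the /detect endpoint and return the decoded JSON response."""
    with open(image_path, "rb") as f:
        body, header = build_multipart("image", "upload.jpg", f.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the Flask app running in another terminal, detect("path/to/image.jpg") returns whatever JSON the server produced.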
Conclusion
Running YOLOv12 on Windows requires careful configuration at each stage, from software installation through training to real-time inference and deployment.
By following the steps above, practitioners can apply YOLOv12 to object detection tasks in real-world applications while balancing accuracy and efficiency.