  • Step by step object detection, STEP 1. Preparing the dataset
    Computer Science/Development 2023. 10. 1. 16:51

    I'm going to write a series of posts that walks through implementing an object detection model step by step.

    There are quite a few small details to take care of, so writing them down should come in handy later.

     

    First, this walkthrough follows the 'Deep Tutorials for PyTorch' tutorials from sgrvinod's GitHub.

    The object detection dataset I used is 'Food Image Data for Health Management' (건강관리를 위한 음식 이미지 데이터) from AI HUB.

     

    The dataset contains a little over 500 food categories. Given my limited computing resources, I decided to run object detection on only a subset of categories from the fruit and vegetable groups.

    The selected categories are as follows.

    • Fruits: mango (1), melon (2), strawberry (3), blueberry (4), apple (5), grapefruit (6), plum (7), nectarine (8), green grape (9), cherry (10)
    • Vegetables: roasted sweet potato (11), roasted chestnut (12), chili pepper (13), chicory (14), kohlrabi (15), paprika (16), tomato (17), shiitake mushroom (18), sweet pumpkin (19), bell pepper (20)

     

    STEP 1. Preparing the dataset

     

    1) Random sampling

    If you download the raw data for each category, a single category alone contains far too many images. So I will draw 400 images for training, 50 for validation, and 50 for testing.

    To do this, I randomly sampled 500 images per category.

    Every category stores its images and labels in the layout shown below. From this location (source), I randomly sample 500 images, copy them, and paste them into a separate folder. So I create a folder named 블루베리 (blueberry) outside the dataset, with images and labels as subfolders.

    < Original dataset - 블루베리 (blueberry) folder >

    📦 [원천] 음식204_Tra   (source images)
    └─ 블루베리
       ├─ A220121XX_03592.jpg
       ├─ A220121XX_03593.jpg
       └─ (remaining files omitted)
    📦 [라벨] 음식204_Tra   (labels)
    └─ 블루베리
       ├─ A220121XX_03592.json
       ├─ A220121XX_03593.json
       └─ (remaining files omitted)

    < Newly created folder >

    📦 블루베리
    ├─ images
    └─ labels

    The code that randomly samples the files and copies them over is shown below. (500_random_sampling.ipynb)

    import os
    import random
    import shutil

    # Source and destination folders for one category
    # (these example paths use the 장어구이, i.e. grilled eel, category)
    source_img = "/Users/jeongin/Desktop/장어구이 image"
    destin_img = "/Users/jeongin/Desktop/장어구이/images"
    source_json = "/Users/jeongin/Desktop/장어구이 json"
    destin_json = "/Users/jeongin/Desktop/장어구이/labels"

    file_list = os.listdir(source_img)
    image_list = [f for f in file_list if f.endswith('.jpg')]

    # Randomly sample 500 image files
    random_images = random.sample(image_list, 500)
    print("random", random_images)

    # Copy the sampled image files
    copied_files = []
    for image in random_images:
        copy_this = os.path.join(source_img, image)
        paste_here = os.path.join(destin_img, image)
        shutil.copy(copy_this, paste_here)
        copied_files.append(image)
    print("copied", copied_files)

    # Build the matching label file names (.jpg -> .json)
    json_list = [img.replace('.jpg', '.json') for img in copied_files]
    print(json_list)

    # Copy the matching label files
    # (the loop variable is not called `json`, to avoid shadowing the json module)
    for json_name in json_list:
        copy_from = os.path.join(source_json, json_name)
        paste_to = os.path.join(destin_json, json_name)
        shutil.copy(copy_from, paste_to)

     

     

    After this step, right-clicking the folder and choosing 'Get Info' reports 501 files. The extra one is '.DS_Store', a hidden system file used by macOS. It stores the folder's display settings, such as the layout of the files and folders inside it, and can safely be ignored.

    [Screenshot: the Size field in 'Get Info' shows 501 items.]
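    Since hidden files like .DS_Store can end up next to the images, it may be safer to filter the folder listing explicitly before sampling or counting. Below is a minimal sketch; the helper name list_images is hypothetical and not part of the original notebook.

```python
import os

def list_images(folder, extensions=(".jpg",)):
    """Return sorted image file names in `folder`, skipping hidden files."""
    names = []
    for name in os.listdir(folder):
        if name.startswith("."):
            # Hidden files such as .DS_Store are not part of the dataset
            continue
        if name.lower().endswith(extensions):
            names.append(name)
    return sorted(names)
```

    With this, len(list_images(folder)) should report 500 for a freshly sampled category even when Finder shows 501 items.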

    Repeating the process above for the 20 selected categories yields 500 x 20 = 10,000 images with their label files.
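    Rather than editing the paths by hand 20 times, the sampling can be wrapped in a function and looped over the categories. This is only a sketch: the function names and the <root>/<category>/images and <root>/<category>/labels layout are assumptions for illustration, and the fixed seed is my addition so the sample is reproducible.

```python
import os
import random
import shutil

def sample_category(src_img, src_json, dst_img, dst_json, n=500, seed=0):
    """Copy n randomly chosen .jpg files and their matching .json labels."""
    os.makedirs(dst_img, exist_ok=True)
    os.makedirs(dst_json, exist_ok=True)
    images = sorted(f for f in os.listdir(src_img) if f.endswith(".jpg"))
    chosen = random.Random(seed).sample(images, n)  # fixed seed: reproducible sample
    for image in chosen:
        label = os.path.splitext(image)[0] + ".json"
        shutil.copy(os.path.join(src_img, image), os.path.join(dst_img, image))
        shutil.copy(os.path.join(src_json, label), os.path.join(dst_json, label))
    return chosen

def sample_all(raw_root, out_root, categories, n=500):
    """Run sample_category for every category (hypothetical folder layout)."""
    for category in categories:
        sample_category(
            os.path.join(raw_root, category, "images"),
            os.path.join(raw_root, category, "labels"),
            os.path.join(out_root, category, "images"),
            os.path.join(out_root, category, "labels"),
            n=n,
        )
```

    The real AI HUB folders are laid out differently (separate source and label archives), so the path building inside sample_all would need to be adapted.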

    From the 500 randomly drawn image/label pairs in each category, set aside 50 for test and 50 for validation.

    Now I create another new folder and split all categories into train, validation, and test sets.

    📦 all
    ├─ train (8000)
    │  ├─ images
    │  └─ labels
    ├─ validation (1000)
    │  ├─ images
    │  └─ labels
    └─ test (1000)
       ├─ images
       └─ labels
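    The 400/50/50 split per category (8000/1000/1000 over 20 categories) can be scripted as well. A minimal sketch, assuming the split is done on the list of sampled file names; split_names and the placeholder names are hypothetical.

```python
import random

def split_names(names, n_val=50, n_test=50, seed=0):
    """Shuffle file names and split them into train / validation / test lists."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

names = [f"img_{i:03d}.jpg" for i in range(500)]  # placeholder file names
train, val, test = split_names(names)
print(len(train), len(val), len(test))  # 400 50 50
```

    Each returned list can then be copied into all/train, all/validation, and all/test together with the matching .json files.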

     

    2) Resize the images & edit the JSON files (add the label and resolution keys)

    Resize every image so that they all have the same size. The target resolution is 300x300.

    The original image files come in different resolutions. This information is stored under the resolution key.

    I also add the numeric label that label_map maps each category to.

    Code:

    import os
    import json
    import cv2

    # The image folders come from the original script. The label folder and the
    # output label folder are assumptions that mirror the image folder names.
    source_img_dir = "/Users/jeongin/Desktop/all/training/images"
    source_json_dir = "/Users/jeongin/Desktop/all/training/labels"
    destin_img_dir = "/Users/jeongin/Desktop/all/training/destin_images"
    destin_json_dir = "/Users/jeongin/Desktop/all/training/destin_labels"

    def resize_and_update_json(image, json_data, target_size=(300, 300)):
        # The boxes are stored as relative coordinates, so only the image is resized
        resized_image = cv2.resize(image, target_size)
        updated_json = []
        for obj in json_data:
            obj["label"] = 20  # change this to the label of the current category
            updated_json.append({'label': obj['label'], **obj})  # put the label key first
        # Resolution of the original image, before resizing
        resolution = {
            'width': image.shape[1],
            'height': image.shape[0]
        }
        return resized_image, updated_json, resolution

    os.makedirs(destin_img_dir, exist_ok=True)
    os.makedirs(destin_json_dir, exist_ok=True)

    for file in os.listdir(source_img_dir):
        if not file.endswith(".jpg"):
            continue  # skip hidden files such as .DS_Store

        # Image paths
        source_img = os.path.join(source_img_dir, file)
        destin_img = os.path.join(destin_img_dir, file)

        # JSON paths
        json_file = os.path.splitext(file)[0] + ".json"
        source_json = os.path.join(source_json_dir, json_file)
        destin_json = os.path.join(destin_json_dir, json_file)

        # Load the image and JSON data
        image = cv2.imread(source_img)
        if image is None:
            continue  # cv2.imread returns None for unreadable files
        with open(source_json, 'r') as json_f:
            json_data = json.load(json_f)

        # Resize the image and add the label key to every object
        resized_image, updated_json, resolution = resize_and_update_json(image, json_data)

        # Save the resized image
        cv2.imwrite(destin_img, resized_image)

        # Group the updated object annotations and resolution into a single dictionary
        output_data = {'objects': updated_json, 'resolution': resolution}

        with open(destin_json, 'w') as json_f:
            json.dump(output_data, json_f, indent=2)

     

    [Screenshot: the updated JSON file.]

     

    3) Convert to absolute coordinates

     

    The original JSON files store the center coordinates (c_x, c_y), the width, and the height as relative coordinates.

    I extract these values and store them as absolute coordinates: the relative values are scaled by the resized image size (300x300) and converted into the corner format [x_min, y_min, x_max, y_max]. The code for the conversion is shown below.

    Code:

    import os
    import json

    def convert(json_dir, resized_width=300, resized_height=300):
        for filename in os.listdir(json_dir):
            if not filename.endswith(".json"):
                continue

            json_file_path = os.path.join(json_dir, filename)

            with open(json_file_path, 'r') as json_file:
                data = json.load(json_file)

            for item in data["objects"]:
                # Relative width, height, and center point of the box
                w = float(item["W"])
                h = float(item["H"])
                x, y = map(float, item["Point(x,y)"].split(','))

                # Center format (relative) -> corner format (absolute pixels)
                x_min = (x - w / 2) * resized_width
                y_min = (y - h / 2) * resized_height
                x_max = (x + w / 2) * resized_width
                y_max = (y + h / 2) * resized_height

                print(x_min, y_min, x_max, y_max)

                item["boxes"] = [x_min, y_min, x_max, y_max]

            with open(json_file_path, 'w') as updated_json_file:
                json.dump(data, updated_json_file, indent=4)

            # Print the path of the updated JSON file
            print(f"Updated JSON file: {json_file_path}")

    # Directory that contains the JSON files
    json_dir = "/Users/jeongin/Desktop/all/train/labels"

    # Run the conversion
    convert(json_dir)

    [Screenshot: final form of the JSON file. The parts shown in blue are the newly added ones.]
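    To check the conversion by hand, here is the same arithmetic as a small standalone function with one worked example. The function name to_absolute_box is hypothetical; the formulas are the ones used in convert.

```python
def to_absolute_box(cx, cy, w, h, img_w=300, img_h=300):
    """Convert a relative (center x, center y, width, height) box to absolute corner pixels."""
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return [x_min, y_min, x_max, y_max]

# A box centered in the image, a quarter of the width and half of the height
print(to_absolute_box(0.5, 0.5, 0.25, 0.5))  # [112.5, 75.0, 187.5, 225.0]
```

    Because the stored coordinates are relative, the same function works for any target size by changing img_w and img_h.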


    4) Create train_images.json, train_objects.json, test_images.json, test_objects.json

     

    To load the data and feed it to the training function, a CustomDataset class definition and a DataLoader are needed.

    CustomDataset has to load the images and labels. The only label information needed from each JSON file is the "label" value and the "boxes" value, so I take just those keys and store them as a list in a new 'train_objects.json' file.

     

    I also store the absolute paths of all images as a list in a new 'train_images.json' file. This way the image paths and the label & boxes for each image are stored in the same order and can be retrieved by index.
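    The index-based lookup can be sketched in plain Python as below. This is not the actual CustomDataset: the class name FoodDetectionIndex is hypothetical, and a real version would presumably subclass torch.utils.data.Dataset and also load and transform the image.

```python
import json
import os

class FoodDetectionIndex:
    """Pairs image paths with their objects by position in the two parallel JSON lists."""

    def __init__(self, data_folder, split="train"):
        with open(os.path.join(data_folder, f"{split}_images.json")) as f:
            self.images = json.load(f)
        with open(os.path.join(data_folder, f"{split}_objects.json")) as f:
            self.objects = json.load(f)
        # Both lists must line up one-to-one
        assert len(self.images) == len(self.objects)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        # Entry i of both lists describes the same image
        return self.images[i], self.objects[i]["boxes"], self.objects[i]["labels"]
```

    Keeping the two lists in the same order is what makes this work, which is why the length check is worth having.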

     

    < Creating train_images.json and train_objects.json >

    All of the code below lives in utils.py in the repository.

     

    1) Define label_map

    First, the label map has to be defined. The 20 categories are assigned labels 1 through 20.

    Label 0 is reserved for the 'background' class, meaning there is no object.

    food_labels = ('mango', 'melon', 'strawberry', 'blueberry', 'apple', 'grapefruit',
                   'plum', 'peach', 'grape', 'cherry', 'yam', 'chestnuts', 'pepper',
                   'chicory', 'Kohlrabi', 'Paprika', 'Tomato', 'Mushroom', 'Pumpkin',
                   'Pimento')
    label_map = {k: v + 1 for v, k in enumerate(food_labels)}

    label_map['background'] = 0
    print(len(label_map))
    rev_label_map = {v: k for k, v in label_map.items()}  # inverse mapping

    # One color per class (21 entries, including background)
    distinct_colors = ['#e6194b', '#3cb44b', '#ffe119', '#0082c8', '#f58231',
                       '#911eb4', '#46f0f0', '#f032e6', '#d2f53c', '#fabebe',
                       '#008080', '#000080', '#aa6e28', '#fffac8', '#800000',
                       '#aaffc3', '#808000', '#ffd8b1', '#e6beff', '#808080', '#FFFFFF']
    label_color_map = {k: distinct_colors[i] for i, k in enumerate(label_map.keys())}

    print("label map:", label_map)

    Output (reformatted for readability):

    21
    label_map: {
                'mango': 1,
                'melon': 2,
                'strawberry': 3,
                'blueberry': 4,
                'apple': 5,
                'grapefruit': 6,
                'plum': 7,
                'peach': 8,
                'grape': 9,
                'cherry': 10,
                'yam': 11,
                'chestnuts': 12,
                'pepper': 13,
                'chicory': 14,
                'Kohlrabi': 15,
                'Paprika': 16,
                'Tomato': 17,
                'Mushroom': 18,
                'Pumpkin': 19,
                'Pimento': 20,
                'background': 0
                }

    2) Get the values of the label and boxes keys from the JSON file

    import json

    def parse_annotation(json_path):
        with open(json_path, 'r') as json_file:
            data = json.load(json_file)
        objects = data.get('objects', [])
        boxes = []
        labels = []
        for obj in objects:
            label = obj.get('label')
            box = obj.get('boxes')
            # Check for missing values before doing arithmetic on the label
            if label is not None and box is not None:
                xmin, ymin, xmax, ymax = box
                boxes.append([xmin, ymin, xmax, ymax])
                labels.append(label + 1)  # shift by 1 so that 0 is left for background
        return {'labels': labels, 'boxes': boxes}

    Inside the JSON files the labels follow a 0~19 mapping, so 1 has to be added to each label when it is brought into train_objects.json. The boxes are taken over unchanged as xmin, ymin, xmax, ymax.

    Now the collected values just need to be written to a new JSON file. To build the image paths, I use train.txt, which contains the file names of all images, together with custom_path (the absolute path of the repository).

    import json
    import os

    def create_data_lists(custom_path, output_folder):
        train_images = list()
        train_objects = list()
        n_objects = 0
        # Training data
        for path in [custom_path]:
            with open(os.path.join(path, 'train.txt')) as f:
                ids = f.read().splitlines()
            for image_id in ids:
                objects = parse_annotation(os.path.join(path, 'train/labels', image_id + '.json'))
                if len(objects['boxes']) == 0:
                    continue  # skip images without any box
                n_objects += len(objects['labels'])
                train_objects.append(objects)
                train_images.append(os.path.join(path, 'train/images', image_id + '.jpg'))
        assert len(train_objects) == len(train_images)

        # Save to file
        with open(os.path.join(output_folder, 'train_images.json'), 'w') as j:
            json.dump(train_images, j)
        with open(os.path.join(output_folder, 'train_objects.json'), 'w') as j:
            json.dump(train_objects, j)
        print('\nThere are %d training images containing a total of %d objects. '
              'Files have been saved to %s.' % (
                  len(train_images), n_objects, os.path.abspath(output_folder)))
        print("length of train_objects", len(train_objects))
        print("length of train_images", len(train_images))
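    The function above expects train.txt to exist already, with one image id (file name without extension) per line. One possible way to generate it is sketched below; write_id_list is a hypothetical helper, not something from the repository.

```python
import os

def write_id_list(image_dir, txt_path):
    """Write one image id (file name without extension) per line."""
    ids = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.endswith(".jpg") and not name.startswith(".")
    )
    with open(txt_path, "w") as f:
        f.write("\n".join(ids))
    return ids
```

    For example, write_id_list(os.path.join(custom_path, 'train/images'), os.path.join(custom_path, 'train.txt')) would produce the file that create_data_lists reads.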

    To run this function, pass the path of the current repository, which contains the train and test folders, as custom_path, and pass the path where the JSON files should be saved as output_folder.

    from utils import create_data_lists
    
    if __name__ == '__main__':
        create_data_lists(custom_path = '/Users/jeongin/PycharmProjects/21class_food_detection',
                          output_folder='/Users/jeongin/PycharmProjects/21class_food_detection/train/json_files')

    The same create_data_lists logic can be applied to the test data, with the train paths and file names swapped for their test counterparts.
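    Since create_data_lists hardcodes the train paths, one option is to make the split name a parameter. The sketch below is self-contained, so it carries its own small reader instead of importing parse_annotation. The names read_objects and create_split_lists are hypothetical, and it assumes a test.txt exists next to train.txt.

```python
import json
import os

def read_objects(json_path):
    """Read labels and boxes from one per-image JSON file (same idea as parse_annotation)."""
    with open(json_path) as f:
        data = json.load(f)
    labels, boxes = [], []
    for obj in data.get("objects", []):
        if obj.get("label") is not None and obj.get("boxes") is not None:
            labels.append(obj["label"] + 1)  # shift so that 0 stays free for background
            boxes.append(obj["boxes"])
    return {"labels": labels, "boxes": boxes}

def create_split_lists(custom_path, output_folder, split):
    """Write <split>_images.json and <split>_objects.json, e.g. for split='test'."""
    with open(os.path.join(custom_path, split + ".txt")) as f:
        ids = f.read().splitlines()
    images, objects = [], []
    for image_id in ids:
        parsed = read_objects(os.path.join(custom_path, split, "labels", image_id + ".json"))
        if not parsed["boxes"]:
            continue  # skip images without any annotated box
        objects.append(parsed)
        images.append(os.path.join(custom_path, split, "images", image_id + ".jpg"))
    os.makedirs(output_folder, exist_ok=True)
    with open(os.path.join(output_folder, split + "_images.json"), "w") as f:
        json.dump(images, f)
    with open(os.path.join(output_folder, split + "_objects.json"), "w") as f:
        json.dump(objects, f)
    return len(images)
```

    Calling create_split_lists(custom_path, output_folder, 'test') would then write test_images.json and test_objects.json in the same format as the training files.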
