์ธตํ™” ๋ถ„๋ฆฌ๋œ ๋ฐ์ดํ„ฐ ๋ถ„ํ• 

๋ถ„ํ•  ๋ฐฉ๋ฒ•

  • ์ธตํ™” ๋ถ„๋ฆฌ (Stratified Split): industry ๋ผ๋ฒจ ๋ถ„ํฌ๋ฅผ ๊ณ ๋ คํ•˜์—ฌ ๋ถ„ํ• 
  • design_idx ๊ทธ๋ฃนํ™”: ๋™์ผํ•œ ๋””์ž์ธ์ด ์—ฌ๋Ÿฌ split์— ๋‚˜๋‰˜์ง€ ์•Š๋„๋ก ์ฒ˜๋ฆฌ
  • ๋น„์œจ: Train 70% / Val 10% / Test 20%
  • Random Seed: 42
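The procedure above can be sketched in a few lines of Python. This is only an illustration, not the contents of scripts/stratified_split.py: the function name `stratified_group_split`, the record layout (dicts with `design_idx` and `industry` keys), and the rule of using each design's most frequent industry label as its stratum are all assumptions.

```python
import random
from collections import Counter, defaultdict


def stratified_group_split(records, ratios=(0.7, 0.1, 0.2), seed=42):
    """Split records into train/val/test so that no design_idx spans two splits."""
    # Collect the industry labels seen for each design.
    labels_by_design = defaultdict(list)
    for rec in records:
        labels_by_design[rec["design_idx"]].append(rec["industry"])

    # Assumption: one stratum per design, taken as its most frequent label.
    designs_by_stratum = defaultdict(list)
    for design, labels in labels_by_design.items():
        stratum = Counter(labels).most_common(1)[0][0]
        designs_by_stratum[stratum].append(design)

    rng = random.Random(seed)
    assignment = {}
    for stratum in sorted(designs_by_stratum):
        # Sort before shuffling so the result depends only on the seed.
        designs = sorted(designs_by_stratum[stratum])
        rng.shuffle(designs)
        n_train = round(len(designs) * ratios[0])
        n_val = round(len(designs) * ratios[1])
        for i, design in enumerate(designs):
            if i < n_train:
                assignment[design] = "train"
            elif i < n_train + n_val:
                assignment[design] = "val"
            else:
                assignment[design] = "test"

    # Every record follows its design into that design's split.
    splits = {"train": [], "val": [], "test": []}
    for rec in records:
        splits[assignment[rec["design_idx"]]].append(rec)
    return splits
```

Because the cut is made per design rather than per record, the record-level ratios only approximate 70/10/20, and very small strata may leave Val or Test without any design for that label.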

Statistics

Number of Designs

  • Train: 44,022 designs
  • Val: 6,228 designs
  • Test: 12,736 designs
  • Total: 62,986 designs

Number of Records

  • Train: 70,109 records
  • Val: 9,981 records
  • Test: 20,340 records
  • Total: 100,430 records
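As a quick arithmetic check, the realized split shares can be recomputed from the counts listed above. The helper name `shares` is hypothetical; the numbers are copied from this card.

```python
# Counts copied from the statistics above.
design_counts = {"train": 44022, "val": 6228, "test": 12736}
record_counts = {"train": 70109, "val": 9981, "test": 20340}


def shares(counts):
    """Return each split's share of the total, in percent, rounded to 2 places."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


print(sum(design_counts.values()), shares(design_counts))
print(sum(record_counts.values()), shares(record_counts))
```

Both levels land close to the 70/10/20 target: roughly 69.9% / 9.9% / 20.2% of designs and 69.8% / 9.9% / 20.3% of records.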

Industry ๋ถ„ํฌ (Train ์ƒ์œ„ 10๊ฐœ)

  • ๊ธฐ์—…/๋น„์ฆˆ๋‹ˆ์Šค/์ „๋ฌธ์„œ๋น„์Šค > ์ œ์กฐ/์ค‘๊ณต์—…/๊ธฐ๊ณ„/๊ธˆ์†: 3,645๊ฐœ (5.20%)
  • IT/ํ…Œํฌ > IT/์›น/๋ฐ์ดํ„ฐ: 2,602๊ฐœ (3.71%)
  • ๋ถ€๋™์‚ฐ/๊ฑด์ถ•/ํ™˜๊ฒฝ > ๊ฑด์ถ• > ๊ฑด์ถ•์„ค๊ณ„/์ธํ…Œ๋ฆฌ์–ด์‹œ๊ณต: 2,127๊ฐœ (3.03%)
  • ์—…์ข… ๋ฒ”์šฉ > ๊ธฐํš์•ˆ/๋ณด๊ณ ์„œ/์ œ์•ˆ์„œ: 2,000๊ฐœ (2.85%)
  • ์—…์ข… ๋ฒ”์šฉ > ์‹œ์„ค์•ˆ๋‚ด/์˜คํ”ผ์Šค๊ด€๋ฆฌ: 1,928๊ฐœ (2.75%)
  • ์˜๋ฃŒ/๊ฑด๊ฐ• > ๋ณ‘์›/์˜์›/์˜๋ฃŒ๊ธฐ๊ด€: 1,640๊ฐœ (2.34%)
  • ๊ณต๊ณต/๊ธฐ๊ด€ > ์ •๋ถ€/๊ณต๊ณต๊ธฐ๊ด€ > ์ค‘์•™์ •๋ถ€/์ง€์ž์ฒด: 1,628๊ฐœ (2.32%)
  • ๊ต์œก/์ปค๋ฆฌ์–ด > ํ•™์›/์˜จ๋ผ์ธ๊ต์œก/๊ธฐํƒ€ > ์ผ๋ฐ˜ํ•™์Šตํ•™์›: 1,574๊ฐœ (2.25%)
  • ์‹์Œ๋ฃŒ/์™ธ์‹ > ์‹์žฌ๋ฃŒ/์‹ํ’ˆํŒ๋งค > ๋†์‚ฐ/์ฒญ๊ณผ/์ž„์‚ฐ: 1,282๊ฐœ (1.83%)
  • ๋ถ€๋™์‚ฐ/๊ฑด์ถ•/ํ™˜๊ฒฝ > ํ™˜๊ฒฝ/์—๋„ˆ์ง€/ESG > ํ™˜๊ฒฝ์ •ํ™”/ํ๊ธฐ๋ฌผ: 1,228๊ฐœ (1.75%)

Verification

The industry distribution of each split is kept similar to that of the full dataset.

How to Reproduce

cd opensource
python scripts/stratified_split.py

์ƒ์„ฑ์ผ: 2026-03-10 ๋ฐฉ๋ฒ•: Stratified sampling by industry labels with design_idx grouping