※ Please view this document on a PC. The mobile version may distort or break the table formatting, and some content may not display correctly.
Statistical Summary of IPs from Macro Clusters A–M
1. Scope of the dataset
- Clusters analysed: 13 (A through M)
- Time window: all logs delivered in the ev 06 package
- Total log lines (entries): 2073
- Total unique IPs: 650 (1 070 when uniques are counted per cluster and summed)
- This memo is a descriptive wrap-up; deeper statistical modelling will follow in EV04.
2. Key findings
| # | Cluster | Total Entries | Unique IPs | In-cluster Duplicates | In-cluster uniqueness (unique ÷ entries) | Share of all uniques (unique ÷ 1 070) |
|---|---|---|---|---|---|---|
| 1 | A | 24 | 24 | 0 | 100 % | 2.2 % |
| 2 | B | 10 | 10 | 0 | 100 % | 0.9 % |
| 3 | C | 73 | 61 | 12 | 83.6 % | 5.7 % |
| 4 | D | 58 | 34 | 24 | 58.6 % | 3.2 % |
| 5 | E | 4 | 4 | 0 | 100 % | 0.4 % |
| 6 | F | 104 | 91 | 13 | 87.5 % | 8.5 % |
| 7 | G | 71 | 62 | 9 | 87.3 % | 5.8 % |
| 8 | H | 100 | 97 | 3 | 97.0 % | 9.1 % |
| 9 | I | 119 | 103 | 16 | 86.6 % | 9.6 % |
| 10 | J | 61 | 56 | 5 | 91.8 % | 5.2 % |
| 11 | K | 70 | 59 | 11 | 84.3 % | 5.5 % |
| 12 | L | 19 | 19 | 0 | 100 % | 1.8 % |
| 13 | M | 1 360 | 450 | 910 | 33.1 % | 42.1 % |
| Σ | A–M | 2 073 | 1 070 (650) | 1 003 | 51.6 % (overall) | 100 % |
- Cluster M dominates the pool: it holds 42 % of the 1 070 per-cluster unique IPs but also shows the highest internal duplication (≈ 67 % of its lines repeat previously seen IPs).
- Smaller clusters (A, B, E, L) have zero duplication and therefore 100 % in-cluster uniqueness, yet together contribute less than 6 % of global uniques.
- Across the entire log set, about one entry in two (51.6 %) is the first occurrence of its IP within its own cluster.
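The derived columns of the table can be re-checked from the two raw columns (total entries and unique IPs). The following is a minimal sketch of that arithmetic; the names `CLUSTERS` and `summarise` are made up for illustration and are not part of the original analysis.

```python
# Per-cluster (total entries, unique IPs), copied from the table above.
CLUSTERS = {
    "A": (24, 24), "B": (10, 10), "C": (73, 61), "D": (58, 34),
    "E": (4, 4), "F": (104, 91), "G": (71, 62), "H": (100, 97),
    "I": (119, 103), "J": (61, 56), "K": (70, 59), "L": (19, 19),
    "M": (1360, 450),
}


def summarise(clusters):
    """Return per-cluster derived columns plus pooled totals."""
    total_entries = sum(entries for entries, _ in clusters.values())
    total_unique = sum(unique for _, unique in clusters.values())
    rows = {}
    for name, (entries, unique) in clusters.items():
        rows[name] = {
            "duplicates": entries - unique,
            "uniqueness_pct": round(100 * unique / entries, 1),
            "share_pct": round(100 * unique / total_unique, 1),
        }
    totals = {
        "entries": total_entries,
        "unique": total_unique,
        "duplicates": total_entries - total_unique,
        # Pooled ratio (sum of uniques / sum of entries), not a mean of the rows.
        "uniqueness_pct": round(100 * total_unique / total_entries, 1),
    }
    return rows, totals


rows, totals = summarise(CLUSTERS)
print(totals)
print(rows["M"])
```

Note that the 51.6 % in the Σ row is a pooled ratio over all entries, so it is pulled down heavily by Cluster M; it is not the mean of the thirteen per-cluster percentages.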
3. Unique IP List Statistical Summary
The 300 most frequent prefixes were extracted for frequency analysis.
- 92 % of all unique IPs fall within the top-300 prefixes.
- More than half of all duplicate occurrences are explained by just the top 10 prefixes.
- A handful of network blocks appears to dominate the dataset.
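A prefix-coverage figure like the ones above could be computed roughly as follows. This is only a sketch: it assumes a "prefix" means the first two octets of an IPv4 address (as in "218.237"), the helper names are hypothetical, and the sample addresses come from reserved documentation ranges rather than from the dataset.

```python
from collections import Counter


def prefix_of(ip, octets=2):
    """Return the leading `octets` octets of a dotted IPv4 string."""
    return ".".join(ip.split(".")[:octets])


def top_prefix_coverage(ips, top_n):
    """Fraction of `ips` whose prefix is among the `top_n` most frequent prefixes."""
    if not ips:
        return 0.0
    counts = Counter(prefix_of(ip) for ip in ips)
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / len(ips)


# Hypothetical sample (documentation address ranges), not real log data.
sample = ["192.0.2.1", "192.0.2.2", "192.0.2.77", "198.51.100.7", "203.0.113.5"]

print(Counter(prefix_of(ip) for ip in sample).most_common(1))
print(top_prefix_coverage(sample, top_n=1))
```

Passing the deduplicated IP list gives the share of unique IPs covered by the top prefixes; passing only the repeated rows gives the share of duplicate occurrences they explain.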
4. Observational notes
- This report presents the first-pass consolidation and cleansing of 2,073 log entries across 13 clusters (A through M). Counted per cluster, these contain 1,070 unique IP addresses; 650 are unique overall.
- High-repeat sub-nets. Prefix 218.237 appears 35 times – the single most recurrent block – and is spread across seven clusters (C, D, F, G, H, I, M), hinting at either shared infrastructure or spoofing.
- Noise vs. signal. The 1 003 duplicate rows come overwhelmingly from M (910), followed by D (24) and I (16). Filtering those will sharpen future behavioural analysis.
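The two notes above (a prefix spread over several clusters, and filtering of duplicate rows) can be sketched as below. It assumes the logs are available as `(cluster, ip)` pairs and that a prefix is the first two octets; the function names and sample records are hypothetical and do not come from the dataset.

```python
from collections import defaultdict


def clusters_per_prefix(records, octets=2):
    """Map each prefix to the sorted list of clusters it appears in."""
    spread = defaultdict(set)
    for cluster, ip in records:
        prefix = ".".join(ip.split(".")[:octets])
        spread[prefix].add(cluster)
    return {prefix: sorted(found) for prefix, found in spread.items()}


def drop_in_cluster_duplicates(records):
    """Keep the first occurrence of each (cluster, ip) pair, preserving order."""
    seen = set()
    kept = []
    for record in records:
        if record not in seen:
            seen.add(record)
            kept.append(record)
    return kept


# Hypothetical records (documentation address ranges), not real log data.
records = [
    ("C", "192.0.2.1"),
    ("M", "192.0.2.1"),
    ("M", "192.0.2.1"),   # in-cluster duplicate
    ("M", "192.0.9.4"),
    ("D", "198.51.100.7"),
]

print(clusters_per_prefix(records))
print(len(records) - len(drop_in_cluster_duplicates(records)), "duplicate row(s) removed")
```

Deduplication here is per cluster, matching the "In-cluster Duplicates" column; the same IP seen in two different clusters is kept in both.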
* Appendix: Duplicate IP Graph for Clusters C and M Collected on June 24
URL : https://gall.dcinside.com/mgallery/board/view/?id=uspolitics&no=2113790
Archive: https://archive.md/JJJwj