Efficient ဖြစ်တဲ့ Array ဘယ်လိုရေးမလဲ (in Swift)

Kyaw Zay Ya Lin Tun

4 min readOct 25, 2022

Prerequisite

Array
Value Vs Reference semantics
Big O notation

What I’ll cover

Array မိတ်ဆက်
reserveCapacity
Using value types in Array
Using Contiguous Array
နိဂုံး

Array မိတ်ဆက်

Array ဆိုတာ element တွေကို အစီအစဉ်တစ်ကျ သိမ်းဆည်းထားနိုင်တဲ့ collection တစ်ခုဖြစ်ပါတယ်။ Array ကို လူတိုင်း မသုံးဖူးသူ မရှိပါဘူး။ Element တွေရဲ့ order ကို သိချင်တဲ့အခါ duplicate ဖြစ်နိုင်ချေရှိတဲ့အခါမျိုးမှာ array ကိုသုံးပြီး သိမ်းဆည်းလေ့ရှိပါတယ်။ အပေါ်ကပြောတဲ့အချက်တွေ တစ်ခုမှမလိုရင်တော့ Set ကိုသုံးတာ ပိုပြီး performant ဖြစ်ပါတယ်။ Set အကြောင်းကတော့ ဒီနေ့ရဲ့ topic မဟုတ်တဲ့အတွက် ချန်လှပ်ထားခဲ့ပါရစေ။ ပြောရရင် Swift ရဲ့ array က အစတည်းက optimize ဖြစ်အောင် လုပ်ထားပြီးသားပါ။ ဒါပေမယ့် ကိုယ့်ရဲ့ array ကို ဒီထက်ပိုပြီး efficient ဖြစ်အောင် ဘာတွေထပ်လုပ်လို့ရမလဲ?

reserveCapacity

Array ရဲ့ memory ပေါ် data store လုပ်ပုံအကြောင်း တစ်ချက်ရှင်းပြပေးချင်ပါတယ်။ C, C++, Java တို့လို language တွေမှာ default အားဖြင့် array ရဲ့ capacity ကို dynamically တိုးလို့မရနိုင်ပါဘူး။ Swift လို language မျိုးမှာတော့ array ထဲကို element ကြိုက်သလောက် append လုပ်လို့ရပါတယ်။ ဒီလို append လုပ်နိုင်အောင်လည်း standard library ထဲမှာ code တွေသွားရေးထားတဲ့အတွက် ဖြစ်ပါတယ်။

‘array’ တစ်ခုရေးလိုက်ပါတယ်။ Elemnet သုံးခုရှိတဲ့အတွက် memory ပေါ်မှာ အခန်းသုံးခန်း store လုပ်နိုင်တဲ့ slot တစ်ခု ယူသွားမှာဖြစ်ပါတယ်။

array ထဲကို နောက်ထပ် element တစ်ခု append လုပ်လိုက်ပါတယ်။ ဒီအခါမှာ တစ်ကယ်က memory ပေါ်မှာ allocate လုပ်ထားတဲ့ capacity အရ သုံးခန်းပဲဆံ့တာပါ။ နောက်တစ်ခုထပ်တိုးလိုက်တော့ လေးခန်းဖြစ်သွားတဲ့အတွက် မဆံ့တော့ပါဘူး။ အဲ့အခါ အခန်းလေးခန်းဆံ့တဲ့ memory slot တစ်ခုထပ်ရှာပြီး ခုနက ရှိပြီးသား သုံးခန်းကို လေးခန်းဆန့်တဲ့ slot ထဲကို copy ယူပြီး အရင်လာထည့်ရပါတယ်။ ပြီးတော့မှ အသစ်ထပ်ထည့်မယ့် element value (‘4’ in our case) ကို append ပြန်လုပ်ရပါတယ်။ ဒီ copy ယူတဲ့ process တစ်ခုလုံးက O(n) ရှိပါတယ် (where n is the length of that array, which is ‘3’ in our example)။ Element အသစ်တစ်ခု append လုပ်တာက constant time O(1) ရှိပါတယ်။ Overall ပြန်တွက်ရရင် အပေါ်က array ထဲကို element အသစ် append လုပ်တဲ့အခါ linear time နဲ့ append လုပ်ရပါတယ်။ (O(n) + O(1) → O(n))

Swift language မှာ array ထဲကို element အသစ်တစ်ခါတိုးတိုင်း တစ်ကြိမ် reallocate လုပ်တာကို ရှောင်ရှားနိုင်ဖို့အတွက် array ရဲ့ capacity ကို exponentially increase လုပ်ပါတယ်။ ဉပမာ array အခန်းက စစချင်း သုံးခန်းရှိတယ် နောက်လေးခန်းမြောက် element ထည့်လိုက်တဲ့အခါ 3 x 2 → 6 ခန်းဆံ့တဲ့ memory slot ကို reallocate လုပ်တယ် နောက်ထပ် element တွေထပ်ထည့်လို့ ခြောက်ခန်းနဲ့မဆံ့တော့ပြန်ရင် 6 x 2 → 12 ခန်းဆံ့တဲ့ memory slot ကို ထပ် reallocate လုပ် အဲ့လို တစ်ခါ capacity limit ရောက်သွားတိုင်း လက်ရှိ capacity ကို နှစ်ဆတိုးတိုးပြီး memory allocate ပြောင်းလုပ်ပါတယ်။

တစ်ကယ်လို့များ ကိုယ့်ရဲ့ array မှာ element ဘယ်နှစ်ခုရှိမယ်ဆိုတာကို အတိအကျ ကြိုသိတယ်ဆိုရင် reserveCapacity(:) ကိုသုံးပြီး ခုနကလို expotential growth pattern ကြောင့်ဖြစ်လာမယ့် processing overhead အချို့ကို ကြိုရှောင်လို့ရတဲ့အတွက် ပိုပြီး performant ဖြစ်ပါတယ်။ သို့သော်လည်း array အခန်းဘယ်နှစ်ခန်းရှိမှန်း အတိအကျမသိနိုင်ဘူးဆိုရင်တော့ reserveCapacity(:) ကိုသုံးထားလည်း capacity exceed ဖြစ်သွားရင် ခုနကလို reallocate ပြန်လုပ်မှာဖြစ်တဲ့အတွက် သိပ်မထူးလှပါဘူး။

အထူးသဖြင့် leet code test တွေမှာဆိုရင်တော့ array တစ်ခု input ပေး အဲ့ကောင်ကို manipulate လုပ်ပြီး return ပြန်ခိုင်းတာမျိုးမှာ input array ရဲ့ capacity ကိုအတိအကျသိတဲ့အတွက် output array ကိုလည်း capacity အတိအကျနဲ့ memory ပေါ် allocate လုပ်ပြီး ပြန်ပေးလို့ရပါတယ်။ အဲ့လိုဆိုရင် linear space complexity ဖြစ်သွားပါပြီ။ ပိုကောင်းတာကတော့ extra space တစ်ခုမှမသုံးပဲ original array ကိုပဲ manipulate လုပ်ပြီး ပြန်ပေးနိုင်ရင်တော့ အတိုင်းထက်အလွန်ပါပဲ။

Using value types over reference types for Array Element

Swift မှာ types တွေကို value types နဲ့ reference types ဆိုပြီး နှစ်မျိုးရှိပါတယ်။ Struct, enum, tuple type တွေက value type မျိုးနွယ်ထဲမှာပါပြီး class type ကတော့ reference type ဖြစ်ပါတယ်။ Array တစ်ခုဆောက်လိုက်တာနဲ့ Swift compiler ကနေ Swift ရဲ့ Array ဆောက်ရမလား Objective C ရဲ့ NSArray ဆောက်ရမလားဆိုပြီး option နှစ်မျိုးကို စဉ်းစားရပါတယ်။ Array element တွေက reference type တွေဖြစ်ခဲ့ရင် NSArray အဖြစ်ပြန်ပြောင်းလို့ရပါတယ်။ အဲ့အတွက်ကြောင့် optimizer ကနေ NSArray properties ပြန်ပြောင်းနိုင်ချေကိုပါ ထည့်တွက်ထားရပါတယ်။ Value type ဖြစ်ခဲ့ရင်တော့ Swift ရဲ့ native array type တစ်ခုတည်းပဲဖြစ်နိုင်ချေရှိပါတယ်။ Objective C code ကို bridge လို့မရပါဘူး။ တစ်ကယ်လို့ ကိုယ့်ရဲ့ array က NSArray ရဲ့ func တွေသုံးစရာမလိုဘူး(များသောအားဖြင့် မလိုပါဘူး Objective C API တွေဆီကို expose လုပ်ရမယ်ဆိုရင်တော့ လိုနိုင်ပါတယ်)ဆိုရင်တော့ value type ကို သုံးလိုက်တာနဲ့ optimizer က NSArray ပြန် bridge ရမယ့် overhead တွေကို eliminate လုပ်ပေးနိုင်ပါတယ်။ ဒါ့အပြင် reference type တွေရဲ့ retain/release ကိစ္စတွေကိုလည်း ထည့်တွက်စရာမလိုတော့ပါဘူး။

သီအိုရီကို code ရေးပြီး စမ်းကြည့်ပါမယ်။ အရင်ဆုံး execution time ကို သိရဖို့အတွက် helper function လေးတစ်ခု ရေးထားပါမယ်။

ပြီးရင်တော့ class type Person နဲ့ struct type Address နှစ်ခု ရေးလိုက်ပါမယ်။

‘people’ ဆိုတဲ့ array နဲ့ ‘address’ ဆိုတဲ့ array နှစ်ခုထဲကို သက်ဆိုင်ရာ object အခု တစ်သန်း ထည့်ကြည့်ပါမယ်။

အပေါ်က code ကို ကျွန်တော့်စက်မှာ run ကြည့်တဲ့အခါ ဒီလို output ထွက်ပါတယ်။

တွက်ကြည့်ရင် 0.02 second နီးပါးလောက် သွားကွာပါတယ်။

တစ်ခုသတိပြုရမှာက အခုလို value type arry ထဲက element တွေအများကြီးကို index တစ်ခုကနေ နောက်တစ်ခုကို shift လုပ်တာ ရွှေ့တာပြောင်းတာတွေလုပ်မယ်ဆိုရင်တော့ reference type တွေလို COW လုပ်တာမဟုတ်ပဲ deep copy လုပ်ပြီး shift ရတာဖြစ်ပါတယ်။ ဒါကြောင့် ဒီလိုအခြေအနေမှာတော့ NSArray bridiging လုပ်တာနဲ့ retain/relese case တွေကို elimante လုပ်လိုက်တာကနေ ထွက်လာတဲ့ peformance benefits နဲ့မကာမိအောင် ပိုပြီး လေးသွားတာမျိုးလည်း ဖြစ်နိုင်ချေပါတယ်။

Keep in mind that there is a trade-off between using large value types and using reference types. In certain cases, the overhead of copying and moving around large value types will outweigh the cost of removing the bridging and retain/release overhead — Apple

Using Contiguous Array

စဉ်းစားကြည့်မယ်ဆိုရင် ဒီနေ့ရေးနေတဲ့ array တွေသည် NSArray အဖြစ် type cast ပြန်လုပ်ရနိုင်ချေ အင်မတန်နည်းပါတယ်။ Reference type သုံးရင် နောက်ကွယ်မှာ NSArray bridging လုပ်နေတာကြီးက အလကား CPU ကို အပိုအလုပ်လုပ်ခိုင်းနေသလိုပါပဲ။ ဒီလိုဆိုရင် reference element တွေပါတဲ့ array ကိုမှ NSArray bridge မလုပ်အောင်ရော လုပ်လို့မရဘူးလားလို့ မေးစရာရှိပါတယ်။ အဖြေကတော့ ContiguousArray type ကိုပြောင်းသုံးရင် ဖြစ်နိုင်ပါတယ်။ ContiguousArray ကလည်း RandomAccessCollection, RangeReplaceableCollection type တွေကို conform လုပ်ထားတဲ့အတွက် Array နဲ့ read/write performance အကုန် အတူတူပါပဲ။ ContiguousArray ကိုလည်း array literal syntax နဲ့ရေးလို့ရသလို func name တွေကလည်း(ကျွန်တော်သိသလောက်) array နဲ့ အတူတူပါပဲ။

If your array’s Element type is a class or @objc protocol and you do not need to bridge the array to NSArray or pass the array to Objective-C APIs, using ContiguousArray may be more efficient and have more predictable performance than Array. — Apple

ဒီတစ်ခါ Person class ကိုပဲ နည်းနည်းပိုပြီး data များသွားအောင် လုပ်လိုက်ပါမယ်။

ပြီးရင်တော့ ရိုးရိုး array နဲ့ contiguous array ထဲကို objct အခုတစ်သန်း ထည့်ကြည့်ပါမယ်။

အပေါ်က code ကို run ရင် ကျွန်တော့်စက်မှာ ဒီလို output ထွက်ပါတယ်။

ကျွန်တော့် M1 စက်ရဲ့ command line မှာ run တာတောင် 0.1s ကွာပါတယ်။ နောက် object creation ကလည်း ဘာ computation မှမပါပဲ ရိုးရိုးရှင်းရှင်း append လုပ်တာပါ။ တစ်ကယ့် real-world မှာ CPU resource challenging ဖြစ်တဲ့ Apple Watch လိုကောင်မျိုးမှာဆိုရင် ပိုသိသာမယ်ထင်ပါတယ်။

နိဂုံး

ဒီတစ်ခေါက်မှာတော့ Swift ရဲ့ Array ကို optimize လုပ်နိုင်မယ့်နည်းလမ်းလေးတွေ ပြောပြပေးခဲ့ပြီးပါပြီ။ Optimization ဆိုတာ တစ်ခါတလေ result ကြီးကြီးမာမားမမြင်ရပဲ တအား subtle ဖြစ်ပါတယ်။ ဒါပေမယ့် သေသေချာချာ profiler နဲ့ performance tuning လုပ်ကြည့်မယ်ဆိုရင် ကိုယ့် app ရဲ့ memory consumption, battery consumption, speed matrix စတာတွေ သိသိသာသာ တိုးတက်လာတယ်ဆိုတာ တွေ့ရမှာပါ။ Performance optimize လုပ်တယ်ဆိုတာမှာ ဉုံဖွဆိုပြီး အကုန်ကောင်းသွားတာထက် factor တစ်ခုခုကိုပိုကောင်းချင်လို့ နောက်ထပ် factor တစ်ခုနဲ့ trade-off လုပ်ရတယ်ဆိုတဲ့အချက်ကိုတော့ သတိပြုမိစေလိုပါတယ်။