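Linux kernel synchronization mechanisms: mutex
Mutex overview
In the Linux kernel, a mutex is a sleeping lock that serializes access to a critical section. Like a spinlock, it lets only one thread into the critical section at a time. The difference shows when the lock cannot be acquired: a spinlock busy-waits in place, whereas a mutex suspends the current thread and blocks. Because it may sleep, a mutex cannot be used in interrupt context.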
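Notes on using mutexes
- Only one process or thread may hold a mutex at a time.
- Only the owner of a mutex may release it.
- The same lock must not be released more than once.
- A lock must not be acquired again by its current holder; doing so deadlocks.
- A mutex must be initialized with the dedicated initializers that the mutex API provides.
- The same mutex must not be initialized more than once.
- Memory functions such as memset or memcpy must not be used to initialize a mutex.
- A thread must release every mutex it holds before it exits.
- A mutex must not be used in hardware-interrupt or softirq context.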
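Mutex structure definition
- owner: records the holder of the mutex
- wait_lock: a spinlock that protects wait_list
- osq: the MCS lock queue, used to support optimistic spinning on the mutex
- wait_list: tasks that cannot acquire the lock are queued here
- magic: used for debugging
- dep_map: used for debugging (lockdep)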
struct mutex {
    atomic_long_t owner;
    spinlock_t wait_lock;
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
    struct optimistic_spin_queue osq; /* Spinner MCS lock */
#endif
    struct list_head wait_list;
#ifdef CONFIG_DEBUG_MUTEXES
    void *magic;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
    struct lockdep_map dep_map;
#endif
};
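Main mutex interfaces
Interface | Description |
---|---|
mutex_init | Initializes a mutex object |
__mutex_init | Called by mutex_init |
DEFINE_MUTEX | Statically defines and initializes a mutex object |
__MUTEX_INITIALIZER | Initializer macro used by DEFINE_MUTEX |
mutex_lock | Acquires the mutex; if it is contended, the task sleeps in the D (uninterruptible) state |
mutex_lock_interruptible | Acquires the mutex; if it is contended, the task sleeps in the S (interruptible) state |
mutex_trylock | Tries to acquire the mutex and returns immediately on failure |
mutex_unlock | Releases the mutex |
mutex_is_locked | Reports whether the mutex is currently locked |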
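Lock acquisition walkthrough
mutex_lock() first calls might_sleep(), which marks the function as one that may sleep and, in debug builds, checks that the calling context allows sleeping. It then calls __mutex_trylock_fast() to try to take the mutex on the fast path. If that fails, it calls __mutex_lock_slowpath() to acquire the mutex.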
void __sched mutex_lock(struct mutex *lock)
{
    might_sleep();
    if (!__mutex_trylock_fast(lock))
        __mutex_lock_slowpath(lock);
}
If the CONFIG_DEBUG_ATOMIC_SLEEP option is not defined, might_sleep() degenerates into might_resched().
#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
# define might_sleep() \
    do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
# define sched_annotate_sleep() (current->task_state_change = 0)
#else
static inline void ___might_sleep(const char *file, int line,
                                  int preempt_offset) { }
static inline void __might_sleep(const char *file, int line,
                                 int preempt_offset) { }
# define might_sleep() do { might_resched(); } while (0)
# define sched_annotate_sleep() do { } while (0)
#endif
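To make the purpose of the debug variant of might_sleep() more concrete, here is a minimal userspace sketch of the idea. It is only a model: every name in it (model_atomic_depth, model_violations, model_might_sleep, model_blocking_call) is hypothetical, and it counts violations instead of printing a kernel warning the way the real __might_sleep() check does.

```c
#include <stdio.h>

/* Hypothetical model: nesting depth of "atomic" sections (for example an
 * interrupt handler or a spinlock-protected region) where sleeping is
 * forbidden. */
static int model_atomic_depth;

/* Hypothetical model: number of violations detected so far. */
static int model_violations;

static void model_enter_atomic(void) { model_atomic_depth++; }
static void model_exit_atomic(void)  { model_atomic_depth--; }

/* Models the debug variant of might_sleep(): complain when a function that
 * may sleep is reached from a context that must not sleep. */
static void model_might_sleep(const char *file, int line)
{
    if (model_atomic_depth > 0) {
        model_violations++;
        fprintf(stderr,
                "BUG (model): sleeping function called from atomic context at %s:%d\n",
                file, line);
    }
}

/* Models a blocking primitive such as mutex_lock(): annotate first. */
static void model_blocking_call(void)
{
    model_might_sleep(__FILE__, __LINE__);
    /* ... a real blocking primitive could put the caller to sleep here ... */
}
```

Calling model_blocking_call() between model_enter_atomic() and model_exit_atomic() increments model_violations, which mirrors why a mutex must not be taken in interrupt context.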
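When the kernel is configured as fully preemptible (CONFIG_PREEMPT) or as non-preemptible (CONFIG_PREEMPT_NONE), might_resched() ends up as an empty macro. With voluntary preemption (CONFIG_PREEMPT_VOLUNTARY), might_resched() calls _cond_resched(), which gives the scheduler a voluntary preemption point.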
#ifdef CONFIG_PREEMPT_VOLUNTARY
extern int _cond_resched(void);
# define might_resched() _cond_resched()
#else
# define might_resched() do { } while (0)
#endif
#ifndef CONFIG_PREEMPT
extern int _cond_resched(void);
#else
static inline int _cond_resched(void) { return 0; }
#endif
_cond_resched() calls should_resched() to check whether the preemption counter is 0. If the counter is 0 and the reschedule flag is set, it calls preempt_schedule_common() to reschedule.
#ifndef CONFIG_PREEMPT
int __sched _cond_resched(void)
{
    if (should_resched(0)) {
        preempt_schedule_common();
        return 1;
    }
    return 0;
}
EXPORT_SYMBOL(_cond_resched);
#endif
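The decision that _cond_resched() makes can be sketched in userspace as follows. This is a simplified model, not kernel code: struct model_task and its fields are hypothetical stand-ins for the per-task preemption counter and reschedule flag, and a counter increment takes the place of the actual context switch.

```c
#include <stdbool.h>

/* Hypothetical model of the per-task state that should_resched() inspects. */
struct model_task {
    int  preempt_count;   /* > 0 means preemption is currently disabled */
    bool need_resched;    /* set when the scheduler wants this task to yield */
    int  times_scheduled; /* model-only counter standing in for a context switch */
};

/* Models should_resched(offset): reschedule only if the preemption counter
 * equals the expected offset and a reschedule has been requested. */
static bool model_should_resched(const struct model_task *t, int offset)
{
    return t->preempt_count == offset && t->need_resched;
}

/* Models _cond_resched(): returns 1 if it yielded the CPU, 0 otherwise. */
static int model_cond_resched(struct model_task *t)
{
    if (model_should_resched(t, 0)) {
        t->need_resched = false;  /* the request has been served */
        t->times_scheduled++;     /* stand-in for preempt_schedule_common() */
        return 1;
    }
    return 0;
}
```

In this model a task yields only when both conditions hold, so a pending reschedule request is ignored for as long as preempt_count is non-zero.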
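__mutex_trylock_fast() calls atomic_long_cmpxchg_acquire() to test whether lock->owner is 0. If it is, the pointer to the current thread's task_struct is stored in lock->owner, which marks the mutex as held by the current thread. If lock->owner is non-zero, the mutex is already held by another thread or is being handed off to the top waiter, so the fast path fails and the current thread has to fall back to the slow path, where it may block. The compare and the store together form a single atomic operation, so no other access to lock->owner can interleave between them.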
static __always_inline bool __mutex_trylock_fast(struct mutex *lock)
{
    unsigned long curr = (unsigned long)current;
    if (!atomic_long_cmpxchg_acquire(&lock->owner, 0UL, curr))
        return true;
    return false;
}
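The same compare-and-exchange pattern can be tried out in userspace with C11 atomics. The sketch below is a model under stated assumptions, not the kernel implementation: struct model_mutex holds only an owner word, the caller passes an arbitrary non-zero id in place of the task_struct pointer, and model_unlock_fast() is a hypothetical counterpart that releases the lock with the reverse exchange.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model: only the owner word of the mutex. */
struct model_mutex {
    atomic_uintptr_t owner;   /* 0 means unlocked, otherwise the holder's id */
};

/* Models the fast path: succeed only if owner changes from 0 to self in one
 * atomic compare-and-exchange with acquire ordering. */
static bool model_trylock_fast(struct model_mutex *m, uintptr_t self)
{
    uintptr_t expected = 0;

    return atomic_compare_exchange_strong_explicit(&m->owner, &expected, self,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Hypothetical uncontended unlock: succeed only if owner changes from self
 * back to 0, with release ordering. A caller that is not the holder fails,
 * which reflects the rule that only the owner may release the lock. */
static bool model_unlock_fast(struct model_mutex *m, uintptr_t self)
{
    uintptr_t expected = self;

    return atomic_compare_exchange_strong_explicit(&m->owner, &expected, 0,
                                                   memory_order_release,
                                                   memory_order_relaxed);
}
```

With two distinct ids, the first model_trylock_fast() call succeeds and the second fails until the holder calls model_unlock_fast(), after which the other id can take the lock.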
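The slow path for acquiring a mutex is __mutex_lock_common(). "Slow" here means blocking the current thread: the current task is appended to the tail of the mutex's wait queue, so that all tasks waiting for the mutex are ordered by arrival time. When the mutex is released, the task at the head of the queue is woken first, that is, the task that has waited longest. In addition, when the first task is inserted into an empty queue, the MUTEX_FLAG_WAITERS flag is set in the mutex flags (kept in the low bits of lock->owner) to indicate that a task is already waiting for this mutex.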
static noinline void __sched
__mutex_lock_slowpath(struct mutex *lock)
{
    __mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
}
static int __sched
__mutex_lock(struct mutex *lock, long state, unsigned int subclass,
             struct lockdep_map *nest_lock, unsigned long ip)
{
    return __mutex_lock_common(lock, state, subclass, nest_lock, ip, NULL, false);
}
static __always_inline int __sched
__mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
                    struct lockdep_map *nest_lock, unsigned long ip,
                    struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
{
    struct mutex_waiter waiter;
    bool first = false;
    struct ww_mutex *ww;
    int ret;
    might_sleep();
    ww = container_of(lock, struct ww_mutex, base);
    if (use_ww_ctx && ww_ctx) {
        if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
            return -EALREADY;
    }
    preempt_disable();
    mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
    if (__mutex_trylock(lock) ||
        mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx, NULL)) {
        /* got the lock, yay! */
        lock_acquired(&lock->dep_map, ip);
        if (use_ww_ctx && ww_ctx)
            ww_mutex_set_context_fastpath(ww, ww_ctx);
        preempt_enable();
        return 0;
    }
    spin_lock(&lock->wait_lock);
    /*
     * After waiting to acquire the wait_lock, try again.
     */
    if (__mutex_trylock(lock)) {
        if (use_ww_ctx && ww_ctx)
            __ww_mutex_wakeup_for_backoff(lock, ww_ctx);
        goto skip_wait;
    }
    debug_mutex_lock_common(lock, &waiter);
    debug_mutex_add_waiter(lock, &waiter, current);
    lock_contended(&lock->dep_map, ip);
    if (!use_ww_ctx) {
        /* add waiting tasks to the end of the waitqueue (FIFO): */
        list_add_tail(&waiter.list, &lock->wait_list);
#ifdef CONFIG_DEBUG_MUTEXES
        waiter.ww_ctx = MUTEX_POISON_WW_CTX;
#endif
    } else {
        /* Add in stamp order, waking up waiters that must back off. */
        ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx);
        if (ret)
            goto err_early_backoff;
        waiter.ww_ctx = ww_ctx;
    }
    waiter.task = current;
    if (__mutex_waiter_is_first(lock, &waiter))
        __mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
    set_current_state(state);
    for (;;) {
        /*
         * Once we hold wait_lock, we're serialized against
         * mutex_unlock() handing the lock off to us, do a trylock
         * before testing the error conditions to make sure we pick up
         * the handoff.
         */
        if (__mutex_trylock(lock))
            goto acquired;
        /*
         * Check for signals and wound conditions while holding
         * wait_lock. This ensures the lock cancellation is ordered
         * against mutex_unlock() and wake-ups do not go missing.
         */
        if (unlikely(signal_pending_state(state, current))) {
            ret = -EINTR;
            goto err;
        }
        if (use_ww_ctx && ww_ctx && ww_ctx->acquired > 0) {
            ret = __ww_mutex_lock_check_stamp(lock, &waiter, ww_ctx);
            if (ret)
                goto err;
        }
        spin_unlock(&lock->wait_lock);
        schedule_preempt_disabled();
        /*
         * ww_mutex needs to always recheck its position since its waiter
         * list is not FIFO ordered.
         */
        if ((use_ww_ctx && ww_ctx) || !first) {
            first = __mutex_waiter_is_first(lock, &waiter);
            if (first)
                __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
        }
        set_current_state(state);
        /*
         * Here we order against unlock; we must either see it change
         * state back to RUNNING and fall through the next schedule(),
         * or we must see its unlock and acquire.
         */
        if (__mutex_trylock(lock) ||
            (first && mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx, &waiter)))
            break;
        spin_lock(&lock->wait_lock);
    }
    spin_lock(&lock->wait_lock);
acquired:
    __set_current_state(TASK_RUNNING);
    mutex_remove_waiter(lock, &waiter, current);
    if (likely(list_empty(&lock->wait_list)))
        __mutex_clear_flag(lock, MUTEX_FLAGS);
    debug_mutex_free_waiter(&waiter);
skip_wait:
    /* got the lock - cleanup and rejoice! */
    lock_acquired(&lock->dep_map, ip);
    if (use_ww_ctx && ww_ctx)
        ww_mutex_set_context_slowpath(ww, ww_ctx);
    spin_unlock(&lock->wait_lock);
    preempt_enable();
    return 0;
err:
    __set_current_state(TASK_RUNNING);
    mutex_remove_waiter(lock, &waiter, current);
err_early_backoff:
    spin_unlock(&lock->wait_lock);
    debug_mutex_free_waiter(&waiter);
    mutex_release(&lock->dep_map, 1, ip);
    preempt_enable();
    return ret;
}
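The bookkeeping around the wait list in the function above (FIFO insertion, setting the waiters flag when the first waiter arrives, clearing the flags once the list is empty) can be illustrated with a small single-threaded model. Everything here is hypothetical: the MODEL_* flag values are assumed to match kernels of this era and may differ in other versions, the list is a plain singly linked queue rather than list_head, and no locking or sleeping is modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits kept in the low bits of the owner word (assumed values). */
#define MODEL_FLAG_WAITERS 0x01UL
#define MODEL_FLAGS        0x07UL

struct model_waiter {
    int id;                       /* stands in for the waiting task */
    struct model_waiter *next;
};

struct model_mutex {
    uintptr_t owner;              /* holder id in the high bits, flags in the low bits */
    struct model_waiter *head;    /* longest-waiting task */
    struct model_waiter *tail;    /* most recent arrival */
};

/* Models the non-ww branch: append at the tail (FIFO) and set the waiters
 * flag when this waiter is the first one in the queue. */
static void model_add_waiter(struct model_mutex *m, struct model_waiter *w)
{
    w->next = NULL;
    if (m->tail)
        m->tail->next = w;
    else
        m->head = w;
    m->tail = w;
    if (m->head == w)
        m->owner |= MODEL_FLAG_WAITERS;
}

/* Models the first waiter being woken, taking the lock and removing itself
 * from the list; all flags are cleared once the list becomes empty. */
static struct model_waiter *model_pop_first_waiter(struct model_mutex *m)
{
    struct model_waiter *w = m->head;

    if (!w)
        return NULL;
    m->head = w->next;
    if (!m->head) {
        m->tail = NULL;
        m->owner &= ~MODEL_FLAGS;
    }
    return w;
}
```

Adding three waiters and popping them returns them in arrival order, and the waiters flag stays set in the owner word until the last one has left the queue.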