The Mystery of ContentProvider-Induced Crashes

QQ音乐技术团队 (QQ Music Technical Team)
Last modified 2017-10-16 11:09:41

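ContentProvider (CP below) is one of Android's four core components. It offers database-style create, read, update, and delete operations on data, and it also works across processes. In a cross-process call, the process that provides the data is called the Server process and the process that requests it is called the Client process. While enjoying the convenience this brings, we may never have considered a hidden risk: the Client process can be killed.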

1. Log Analysis

06-06 21:57:52.892   916  2275 I ActivityManager: Start proc 26931:com.example.music/u0a103 for content provider com.example.music/.sharedfileaccessor.ContentProviderImpl
06-06 21:57:53.393   916   941 I ActivityManager: Process com.example.music (pid 26931) has died
06-06 21:57:53.423   916   941 I ActivityManager: Killing 16141:com.example.music:service/u0a103 (adj 2): depends on provider com.example.music/.sharedfileaccessor.ContentProviderImpl in dying proc com.example.music

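These are the three key log lines written when ActivityManagerService (AMS below) kills the Client process:

  1. Line 1: the CP's Server process is not running, so AMS starts it first.
  2. Line 2: shortly after starting, the Server process dies for some reason (in practice, possibly the low memory killer).
  3. Line 3: because the CP's Server process has died, AMS kills the CP's Client process.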
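
When scanning a long logcat capture for this pattern, it can help to pull the fields out of the third line mechanically. The following is a minimal sketch in plain Java; `ProviderKillLogParser` and `KillRecord` are hypothetical names, and the regular expression assumes the line has exactly the shape shown in the log above, which may differ between Android versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Android: parses the "Killing ..." line shown above. */
final class ProviderKillLogParser {

    /** The fields carried by one "depends on provider" kill record. */
    static final class KillRecord {
        final int clientPid;
        final String clientProcess;
        final String provider;
        final String dyingProcess;

        KillRecord(int clientPid, String clientProcess, String provider, String dyingProcess) {
            this.clientPid = clientPid;
            this.clientProcess = clientProcess;
            this.provider = provider;
            this.dyingProcess = dyingProcess;
        }
    }

    // Assumed shape, taken from the third log line above:
    // "Killing <pid>:<process>/<uid> (adj <n>): depends on provider <cp> in dying proc <process>"
    private static final Pattern KILL_LINE = Pattern.compile(
            "Killing (\\d+):(\\S+)/\\S+ \\(adj -?\\d+\\): "
                    + "depends on provider (\\S+) in dying proc (\\S+)");

    /** Returns the parsed record, or null if the line is not a provider-dependency kill. */
    static KillRecord parse(String logLine) {
        Matcher m = KILL_LINE.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new KillRecord(Integer.parseInt(m.group(1)), m.group(2), m.group(3), m.group(4));
    }

    public static void main(String[] args) {
        KillRecord r = parse("06-06 21:57:53.423   916   941 I ActivityManager: "
                + "Killing 16141:com.example.music:service/u0a103 (adj 2): depends on provider "
                + "com.example.music/.sharedfileaccessor.ContentProviderImpl "
                + "in dying proc com.example.music");
        System.out.println(r.clientProcess + " (pid " + r.clientPid + ") was killed because "
                + r.dyingProcess + " died while hosting " + r.provider);
    }
}
```

Lines that do not match, such as the "has died" line, yield null, so the same helper can be used as a filter over a whole log file.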

So what mechanism leads AMS to kill an innocent Client process? Answering that requires reading the source code.

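2. Cleaning Up the CPs of a Dead Process

First, we dig into the Android source (version 6.x in what follows) and, starting from the "has died" log, look at what cleanup AMS [1] performs for a process that has already died.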

final void appDiedLocked(ProcessRecord app, int pid, IApplicationThread thread,
        boolean fromBinderDied) {
    // Clean up already done if the process has been re-started.
    if (app.pid == pid && app.thread != null &&
            app.thread.asBinder() == thread.asBinder()) {
        boolean doLowMem = app.instrumentationClass == null;
        boolean doOomAdj = doLowMem;
        if (!app.killedByAm) {
            // The "has died" log is printed here!!!
            Slog.i(TAG, "Process " + app.processName + " (pid " + pid
                    + ") has died");
            mAllowLowerMemLevel = true;
        } else {
            // Note that we always want to do oom adj to update our state with the
            // new number of procs.
            mAllowLowerMemLevel = false;
            doLowMem = false;
        }
        EventLog.writeEvent(EventLogTags.AM_PROC_DIED, app.userId, app.pid, app.processName);
        if (DEBUG_CLEANUP) Slog.v(TAG_CLEANUP,
            "Dying app: " + app + ", pid: " + pid + ", thread: " + thread.asBinder());
        handleAppDiedLocked(app, false, true);
        if (doOomAdj) {
            updateOomAdjLocked();
        }
        if (doLowMem) {
            doLowMemReportIfNeededLocked(app);
        }
    } else if (app.pid != pid) {
        // A new process has already been started.
        Slog.i(TAG, "Process " + app.processName + " (pid " + pid
                + ") has died and restarted (pid " + app.pid + ").");
        EventLog.writeEvent(EventLogTags.AM_PROC_DIED, app.userId, app.pid, app.processName);
    } else if (DEBUG_PROCESSES) {
        Slog.d(TAG_PROCESSES, "Received spurious death notification for thread "
                + thread.asBinder());
    }
}

In AMS's appDiedLocked() method we find where the "has died" log is printed. Execution then continues into handleAppDiedLocked().

private final void handleAppDiedLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart) {
    int pid = app.pid;
    boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1);
    // ... (rest of the method omitted)
}

private final boolean cleanUpApplicationRecordLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart, int index) {
    // ... (omitted)
    boolean restart = false;
    // ... (omitted)

    // Take care of any launching providers waiting for this process.
    if (cleanupAppInLaunchingProvidersLocked(app, false)) {
        restart = true;
    }
    // ... (rest of the method omitted)
}

boolean cleanupAppInLaunchingProvidersLocked(ProcessRecord app, boolean alwaysBad) {
    // Look through the content providers we are waiting to have launched,
    // and if any run in this process then either schedule a restart of
    // the process or kill the client waiting for it if this process has
    // gone bad.
    boolean restart = false;
    for (int i = mLaunchingProviders.size() - 1; i >= 0; i--) {
        ContentProviderRecord cpr = mLaunchingProviders.get(i);
        if (cpr.launchingApp == app) {
            if (!alwaysBad && !app.bad && cpr.hasConnectionOrHandle()) {
                restart = true;
            } else {
                removeDyingProviderLocked(app, cpr, true);
            }
        }
    }
    return restart;
}

private final boolean removeDyingProviderLocked(ProcessRecord proc,
        ContentProviderRecord cpr, boolean always) {
    final boolean inLaunching = mLaunchingProviders.contains(cpr);
    // ... (omitted)
    for (int i = cpr.connections.size() - 1; i >= 0; i--) {
        ContentProviderConnection conn = cpr.connections.get(i);
        if (conn.waiting) {
            // If this connection is waiting for the provider, then we don't
            // need to mess with its process unless we are always removing
            // or for some reason the provider is not currently launching.
            if (inLaunching && !always) {
                continue;
            }
        }
        // This is the Client process of this ContentProvider connection.
        ProcessRecord capp = conn.client;
        conn.dead = true;
        // The key check: stableCount must be greater than 0.
        if (conn.stableCount > 0) {
            if (!capp.persistent && capp.thread != null
                    && capp.pid != 0
                    && capp.pid != MY_PID) {
                // This is exactly where the Client process is killed.
                capp.kill("depends on provider "
                        + cpr.name.flattenToShortString()
                        + " in dying proc " + (proc != null ? proc.processName : "??")
                        + " (adj " + (proc != null ? proc.setAdj : "??") + ")", true);
            }
        } else if (capp.thread != null && conn.provider.provider != null) {
            // ... (omitted)
        }
    }
    // ... (omitted)
    return inLaunching;
}

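Following the call chain handleAppDiedLocked() -> cleanUpApplicationRecordLocked() -> cleanupAppInLaunchingProvidersLocked() -> removeDyingProviderLocked(), we finally reach the place where the Client process is killed, and the reason string it builds ("depends on provider ... in dying proc ...") is the one we saw in the log.

Even inside removeDyingProviderLocked(), however, several nested conditions must hold before the code that kills the Client process is reached. The most important one is conn.stableCount > 0. So when does the stableCount of a ContentProviderConnection (CPC below) go up, and when does it go down?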
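
The nested conditions can be condensed into a single predicate. The sketch below is a simplified model in plain Java, not the real AMS code: `DyingProviderCheck`, `Client`, and `wouldKill` are hypothetical names, `hasThread` stands in for `capp.thread != null`, the `MY_PID` value is merely the system_server pid seen in the log above, and the `conn.waiting` shortcut is left out.

```java
/** Simplified, hypothetical model of the kill check; not the real AMS code. */
final class DyingProviderCheck {

    /** pid of system_server in the log above; used here only as an illustrative MY_PID. */
    static final int MY_PID = 916;

    /** Stand-in for the ProcessRecord fields that the check reads. */
    static final class Client {
        final int pid;
        final boolean persistent;
        final boolean hasThread; // models "capp.thread != null"

        Client(int pid, boolean persistent, boolean hasThread) {
            this.pid = pid;
            this.persistent = persistent;
            this.hasThread = hasThread;
        }
    }

    /** Mirrors the nested conditions guarding capp.kill() in the excerpt. */
    static boolean wouldKill(Client capp, int stableCount) {
        if (stableCount <= 0) {
            return false; // no stable reference: capp.kill() is not reached
        }
        return !capp.persistent && capp.hasThread && capp.pid != 0 && capp.pid != MY_PID;
    }

    public static void main(String[] args) {
        Client musicService = new Client(16141, false, true);
        System.out.println(wouldKill(musicService, 1)); // true
        System.out.println(wouldKill(musicService, 0)); // false
    }
}
```

Written this way, it is easy to see that an ordinary, non-persistent app process has only one lever it can influence: whether it holds a stable reference at the moment the Server process dies.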

3. Incrementing the CPC's stableCount

stableCount is incremented in AMS's incProviderCountLocked() method. Within AMS, the call chain is AMS.getContentProvider() -> AMS.getContentProviderImpl() -> AMS.incProviderCountLocked():

ContentProviderConnection incProviderCountLocked(ProcessRecord r,
        final ContentProviderRecord cpr, IBinder externalProcessToken, boolean stable) {
    if (r != null) {
        for (int i=0; i<r.conProviders.size(); i++) {
            ContentProviderConnection conn = r.conProviders.get(i);
            if (conn.provider == cpr) {
                if (stable) {
                    // stableCount is incremented here.
                    conn.stableCount++;
                    conn.numStableIncs++;
                }
                // ... (unstable branch omitted)
                return conn;
            }
        }
        // No existing connection to the target ContentProvider was found in
        // conProviders, so create a new one and initialize its stableCount to 1.
        ContentProviderConnection conn = new ContentProviderConnection(cpr, r);
        if (stable) {
            conn.stableCount = 1;
            conn.numStableIncs = 1;
        }
        // ... (unstable branch omitted)
        cpr.connections.add(conn);
        r.conProviders.add(conn);
        return conn;
    }
    // ... (handling of a null ProcessRecord omitted)
}

private ContentProviderHolder getContentProviderImpl(IApplicationThread caller,
        String name, IBinder token, boolean stable, int userId) {
    synchronized(this) {
        boolean providerRunning = cpr != null && cpr.proc != null && !cpr.proc.killed;
        if (providerRunning) {
            conn = incProviderCountLocked(r, cpr, token, stable);
        }

        if (!providerRunning) {
            conn = incProviderCountLocked(r, cpr, token, stable);
        }
        checkTime(startTime, "getContentProviderImpl: done!");
    }
}

@Override
public final ContentProviderHolder getContentProvider(
        IApplicationThread caller, String name, int userId, boolean stable) {
    return getContentProviderImpl(caller, name, null, stable, userId);
}

AMS.getContentProvider() is in turn called from ActivityThread [2] (AT below).

public final IContentProvider acquireProvider(
        Context c, String auth, int userId, boolean stable) {
    try {
        holder = ActivityManagerNative.getDefault().getContentProvider(
                getApplicationThread(), auth, userId, stable);
    } catch (RemoteException ex) {
        throw ex.rethrowFromSystemServer();
    }
    return holder.provider;
}

AT.acquireProvider() is called from ContextImpl [3].

private static final class ApplicationContentResolver extends ContentResolver {
    private final ActivityThread mMainThread;
    private final UserHandle mUser;

    public ApplicationContentResolver(
            Context context, ActivityThread mainThread, UserHandle user) {
        super(context);
        mMainThread = Preconditions.checkNotNull(mainThread);
        mUser = Preconditions.checkNotNull(user);
    }

    @Override
    protected IContentProvider acquireProvider(Context context, String auth) {
        return mMainThread.acquireProvider(context,
                ContentProvider.getAuthorityWithoutUserId(auth),
                resolveUserIdFromAuthority(auth), true);
    }
}

private ContextImpl(ContextImpl container, ActivityThread mainThread,
        LoadedApk packageInfo, IBinder activityToken, UserHandle user, int flags,
        Display display, Configuration overrideConfiguration, int createDisplayWithId) {
    //Create a new instance in the constructor.
    mContentResolver = new ApplicationContentResolver(this, mainThread, user);
}

@Override
public ContentResolver getContentResolver() {
    return mContentResolver;
}

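ContextImpl is the implementation class behind the Context that Android application developers work with all the time. Its constructor creates mContentResolver, which getContentResolver() later returns, and that is a method we always call when using a ContentProvider.

mContentResolver is of type ApplicationContentResolver (ACR below), a concrete subclass of ContentResolver [4] (CR below). ACR's implementation of acquireProvider() simply returns the result of AT.acquireProvider(). ACR.acquireProvider() is itself called from CR.acquireProvider():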

public final IContentProvider acquireProvider(Uri uri) {
    if (!SCHEME_CONTENT.equals(uri.getScheme())) {
        return null;
    }
    final String auth = uri.getAuthority();
    if (auth != null) {
        // calls the abstract method, which is implemented in the ApplicationContentResolver
        return acquireProvider(mContext, auth);
    }
    return null;
}

In every one of CR's insert, delete, query, and update methods, calls to acquireProvider() and releaseProvider() come in pairs.

public final @Nullable Uri insert(@RequiresPermission.Write @NonNull Uri url,
            @Nullable ContentValues values) {
    IContentProvider provider = acquireProvider(url);
    try {
    } catch (RemoteException e) {
    } finally {
        releaseProvider(provider);
    }
}

public final int delete(@RequiresPermission.Write @NonNull Uri url, @Nullable String where,
        @Nullable String[] selectionArgs) {
    IContentProvider provider = acquireProvider(url);
    try {
    } catch (RemoteException e) {
    } finally {
        releaseProvider(provider);
    }
}

public final @Nullable Cursor query(final @RequiresPermission.Read @NonNull Uri uri,
        @Nullable String[] projection, @Nullable String selection,
        @Nullable String[] selectionArgs, @Nullable String sortOrder,
        @Nullable CancellationSignal cancellationSignal) {
    try {
        try {
            qCursor = unstableProvider.query(mPackageName, uri, projection,
                    selection, selectionArgs, sortOrder, remoteCancellationSignal);
        } catch (DeadObjectException e) {
            stableProvider = acquireProvider(uri);
        }
    } catch (RemoteException e) {
        // Arbitrary and not worth documenting, as Activity
        // Manager will kill this process shortly anyway.
        return null;
    } finally {
        if (stableProvider != null) {
            releaseProvider(stableProvider);
        }
    }
}

4. Decrementing the CPC's stableCount

A call to CR.releaseProvider() is very likely what decrements stableCount; we keep tracing the code to confirm this. ACR.releaseProvider() directly calls AT.releaseProvider().

private static final class ApplicationContentResolver extends ContentResolver {
    @Override
    public boolean releaseProvider(IContentProvider provider) {
        return mMainThread.releaseProvider(provider, true);
    }
}

Sure enough, stableCount is decremented by 1 inside AT.releaseProvider().

public final boolean releaseProvider(IContentProvider provider, boolean stable) {
    synchronized (mProviderMap) {
        if (stable) {
            prc.stableCount -= 1;
        }
    }
}

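At this point we understand how AMS comes to kill a CP's Client process: if the Server process dies while a CR method call is in progress, then, when AMS cleans up the dead Server process's CPs, it kills every Client process whose connection has stableCount > 0.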
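
The window of exposure can be illustrated with a small model of the paired acquire/release calls. This is a sketch in plain Java under simplifying assumptions, not Android code: `StableReferenceWindow`, `call`, and `atRisk` are hypothetical names, and a single counter stands in for the reference counting described above.

```java
import java.util.function.Supplier;

/** Hypothetical model of one client's stable reference to a provider; not Android code. */
final class StableReferenceWindow {
    private int stableCount;

    void acquire() {
        stableCount++;
    }

    void release() {
        if (stableCount > 0) {
            stableCount--;
        }
    }

    /** True if, per the stableCount > 0 check, the client would be killed were the Server to die now. */
    boolean atRisk() {
        return stableCount > 0;
    }

    /** Runs one call with paired acquire/release, the way ContentResolver.insert() is structured. */
    <T> T call(Supplier<T> providerCall) {
        acquire();
        try {
            return providerCall.get();
        } finally {
            release(); // always runs, so the count drops back even if the call throws
        }
    }

    public static void main(String[] args) {
        StableReferenceWindow ref = new StableReferenceWindow();
        Boolean during = ref.call(() -> ref.atRisk());
        System.out.println("during call: " + during + ", after call: " + ref.atRisk());
        // prints "during call: true, after call: false"
    }
}
```

In this model the client is exposed only between acquire and release, which is why a slow operation on the Server side lengthens the period during which a Server death takes the Client down with it.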

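5. Another Scenario in Which AMS Kills the Client Process

In AMS.getContentProviderImpl(), if the CP's Server process turns out not to be running, startProcessLocked() is called to start it, and then incProviderCountLocked() is called, which increments stableCount. Once the Server process has started, it calls AMS.attachApplicationLocked(). Two pieces of logic in that method matter here:

  • it sends mHandler a CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG message delayed by 10 seconds;
  • it calls ActivityThread.bindApplication().
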
private ContentProviderHolder getContentProviderImpl(IApplicationThread caller,
        String name, IBinder token, boolean stable, int userId) {
    synchronized(this) {
        boolean providerRunning = cpr != null && cpr.proc != null && !cpr.proc.killed;
        if (!providerRunning) {
            // If the provider is not already being launched, then get it
            // started.
            if (i >= N) {
                try {
                    ProcessRecord proc = getProcessRecordLocked(
                            cpi.processName, cpr.appInfo.uid, false);
                    if (proc != null && proc.thread != null && !proc.killed) {
                    } else {
                        proc = startProcessLocked(cpi.processName,
                                cpr.appInfo, false, 0, "content provider",
                                new ComponentName(cpi.applicationInfo.packageName,
                                        cpi.name), false, false, false);
                    }
                }
            }
            //increase stableCount
            conn = incProviderCountLocked(r, cpr, token, stable);
        }
    }
}

private final boolean attachApplicationLocked(IApplicationThread thread,
        int pid) {
    if (providers != null && checkAppInLaunchingProvidersLocked(app)) {
        Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
        msg.obj = app;
        mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
    }
    try {
        ProfilerInfo profilerInfo = profileFile == null ? null
                : new ProfilerInfo(profileFile, profileFd, samplingInterval, profileAutoStop);
        thread.bindApplication(processName, appInfo, providers, app.instrumentationClass,
                profilerInfo, app.instrumentationArguments, app.instrumentationWatcher,
                app.instrumentationUiAutomationConnection, testMode,
                mBinderTransactionTrackingEnabled, enableTrackAllocation,
                isRestrictedBackupMode || !normalMode, app.persistent,
                new Configuration(mConfiguration), app.compat,
                getCommonServicesLocked(app.isolated),
                mCoreSettingsObserver.getCoreSettingsLocked());
    }
}

Handling the CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG message goes through AMS.processContentProviderPublishTimedOutLocked() -> AMS.cleanupAppInLaunchingProvidersLocked(). The latter was covered above: when it finds CPC.stableCount > 0, AMS kills the Client process.

final class MainHandler extends Handler {
    public MainHandler(Looper looper) {
        super(looper, null, true);
    }

    @Override
    public void handleMessage(Message msg) {
        switch (msg.what) {
        case CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG: {
            ProcessRecord app = (ProcessRecord)msg.obj;
            synchronized (ActivityManagerService.this) {
                processContentProviderPublishTimedOutLocked(app);
            }
        } break;
        }
    }
}

private final void processContentProviderPublishTimedOutLocked(ProcessRecord app) {
    cleanupAppInLaunchingProvidersLocked(app, true);
    removeProcessLocked(app, false, true, "timeout publishing content providers");
}

With only a 10-second countdown, let's see how AT.bindApplication() comes to the rescue. bindApplication() sends a BIND_APPLICATION message to the Handler.

private class ApplicationThread extends ApplicationThreadNative {
    public final void bindApplication(String processName, ApplicationInfo appInfo,
            List<ProviderInfo> providers, ComponentName instrumentationName,
            ProfilerInfo profilerInfo, Bundle instrumentationArgs,
            IInstrumentationWatcher instrumentationWatcher,
            IUiAutomationConnection instrumentationUiConnection, int debugMode,
            boolean enableBinderTracking, boolean trackAllocation,
            boolean isRestrictedBackupMode, boolean persistent, Configuration config,
            CompatibilityInfo compatInfo, Map<String, IBinder> services, Bundle coreSettings) {
        sendMessage(H.BIND_APPLICATION, data);
    }
}

AT's Handler handles this message by calling handleBindApplication().

            case BIND_APPLICATION:
                handleBindApplication(data);
                break;

The chain continues as AT.handleBindApplication() -> AT.installContentProviders(). Two things in the latter deserve attention:

  • installProvider() calls ContentProvider.attachInfo();
  • it calls AMS.publishContentProviders().

private void handleBindApplication(AppBindData data) {
    try {
        if (!data.restrictedBackupMode) {
            if (!ArrayUtils.isEmpty(data.providers)) {
                installContentProviders(app, data.providers);
            }
        }
    }
}

private void installContentProviders(
        Context context, List<ProviderInfo> providers) {
    final ArrayList<IActivityManager.ContentProviderHolder> results =
            new ArrayList<IActivityManager.ContentProviderHolder>();

    for (ProviderInfo cpi : providers) {
        IActivityManager.ContentProviderHolder cph = installProvider(context, null, cpi,
                false /*noisy*/, true /*noReleaseNeeded*/, true /*stable*/);
        if (cph != null) {
            cph.noReleaseNeeded = true;
            results.add(cph);
        }
    }

    try {
        ActivityManagerNative.getDefault().publishContentProviders(
            getApplicationThread(), results);
    } catch (RemoteException ex) {
        throw ex.rethrowFromSystemServer();
    }
}

private IActivityManager.ContentProviderHolder installProvider(Context context,
        IActivityManager.ContentProviderHolder holder, ProviderInfo info,
        boolean noisy, boolean noReleaseNeeded, boolean stable) {
    ContentProvider localProvider = null;
    IContentProvider provider;
    if (holder == null || holder.provider == null) {
        try {
            //attachInfo calls the ContentProvider.onCreate() method
            localProvider.attachInfo(c, info);
        }
    }
}

Looking at CP.attachInfo(), we see that it ends up calling CP.onCreate(), which is an abstract method. When we write our own CP, this is the method we must implement.

public void attachInfo(Context context, ProviderInfo info) {
    attachInfo(context, info, false);
}

private void attachInfo(Context context, ProviderInfo info, boolean testing) {
    mNoPerms = testing;

    /*
     * Only allow it to be set once, so after the content service gives
     * this to us clients can't change it.
     */
    if (mContext == null) {
        ContentProvider.this.onCreate();
    }
}

public abstract boolean onCreate();

In AMS.publishContentProviders() we finally find the key that defuses the time bomb: CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG is removed!

public final void publishContentProviders(IApplicationThread caller,
        List<ContentProviderHolder> providers) {
    synchronized (this) {
        final int N = providers.size();
        for (int i = 0; i < N; i++) {
            if (dst != null) {
                if (wasInLaunchingProviders) {
                    mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
                }
            }
        }
    }
}

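So AMS gives the CP's Server process 10 seconds to start up and for the CP to complete its setup (which includes CP.onCreate()); otherwise the Client process cannot escape being killed.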
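
The arm-then-disarm pattern behind this countdown can be sketched without any Android classes. The model below is a hypothetical, single-threaded stand-in in plain Java: `PublishTimeoutModel` and its methods are illustrative names, a fake clock replaces the real Handler message queue, and "timing out" simply records the process name instead of killing anything.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical, single-threaded model of the publish timeout; not the Android Handler API. */
final class PublishTimeoutModel {
    static final long PUBLISH_TIMEOUT_MS = 10_000; // the 10 seconds mentioned above

    private static final class PendingTimeout {
        final long dueAtMs;
        final String process;

        PendingTimeout(long dueAtMs, String process) {
            this.dueAtMs = dueAtMs;
            this.process = process;
        }
    }

    private final List<PendingTimeout> pending = new ArrayList<>();
    private final List<String> timedOut = new ArrayList<>();
    private long nowMs; // fake clock, so the example needs no real waiting

    /** Models attachApplicationLocked(): arms the timeout for a launching Server process. */
    void attach(String process) {
        pending.add(new PendingTimeout(nowMs + PUBLISH_TIMEOUT_MS, process));
    }

    /** Models publishContentProviders(): disarms the timeout, like removeMessages(). */
    void publish(String process) {
        pending.removeIf(p -> p.process.equals(process));
    }

    /** Advances the fake clock and fires every timeout that has come due. */
    void advance(long millis) {
        nowMs += millis;
        for (Iterator<PendingTimeout> it = pending.iterator(); it.hasNext();) {
            PendingTimeout p = it.next();
            if (p.dueAtMs <= nowMs) {
                timedOut.add(p.process); // here AMS would run its timeout cleanup
                it.remove();
            }
        }
    }

    boolean hasTimedOut(String process) {
        return timedOut.contains(process);
    }

    public static void main(String[] args) {
        PublishTimeoutModel slow = new PublishTimeoutModel();
        slow.attach("com.example.music");
        slow.advance(PUBLISH_TIMEOUT_MS); // setup took too long; nothing was published
        System.out.println("timed out: " + slow.hasTimedOut("com.example.music"));
    }
}
```

A Server process that publishes within the window never shows up in the timed-out list, which is the practical argument for keeping CP.onCreate() and the rest of process startup short.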

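6. Summary

When we choose ContentProvider as a cross-process communication mechanism, we have to take into account that the Client process may be killed, because this does not appear to be completely avoidable.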

7. References

Understanding how ContentProvider works: http://gityuan.com/2016/07/30/content-provider/
ContentProvider reference counting: http://gityuan.com/2016/05/03/content_provider_release/

8. Source Files

[1] frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
[2] frameworks/base/core/java/android/app/ActivityThread.java
[3] frameworks/base/core/java/android/app/ContextImpl.java
[4] frameworks/base/core/java/android/content/ContentResolver.java

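Original-content notice: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, please contact cloudcommunity@tencent.com to have it removed.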
