
[JSON FFI] Example Android Application using JSON FFI Engine #2322

Merged: 2 commits into mlc-ai:main on May 18, 2024

Conversation

Kartik14 (Contributor)

This PR adds a small example apk for loading and running phi-2 using the JSON FFI backend.

Steps to run the apk:

  1. Copy the downloaded phi-2 model to the module's files directory:
    cp -r /storage/emulated/0/Android/data/ai.mlc.mlcchat/files/ /storage/emulated/0/Android/data/ai.mlc.mlcengineexample/files/

  2. Change the model lib path to the custom system-lib used while compiling the model library, in /mlc-llm/android/MLCChat/mlcengineexample/src/main/java/ai/mlc/mlcengineexample/MainActivity.kt (see the sketch after these steps).

  3. Build and run the mlcengineexample module.
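For step 2, a purely hypothetical sketch of the kind of change involved; the property names and placeholder values below are assumptions, and the real MainActivity.kt may differ:

// Hypothetical sketch only; the actual property names in MainActivity.kt may differ.
// modelLib must match the system-lib prefix used when the model library was compiled.
val modelPath = "/storage/emulated/0/Android/data/ai.mlc.mlcengineexample/files/<model-dir>"
val modelLib = "<your-system-lib-prefix>"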

This was worked on collaboratively with @anibohara2000!

Additional Comments:

  • Removed the traceRecorder argument from the init_background_engine function call
  • Split the device argument into device_type and device_id

@@ -0,0 +1 @@
/build
Contributor:

Let us make mlcengineexample independent from MLCChat, as in the iOS case. We can use a separate packaging list via mlc-package-config.json.

import android.util.Log
import kotlin.concurrent.thread

class MLCEngine (private val streamCallback: (String) -> Unit) {
Contributor:

Move MLCEngine to be part of the mlc4j library (implement it in Java?). JSONFFIEngine should not be exposed to users and should only be used internally. Users should interact with MLCEngine.

(This can be a TODO.) Ideally, we should not use the stream callback directly; instead, the stream callback should put the incoming results into queues, and chatCompletion should iterate over these queues.
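A minimal Kotlin sketch of that queue-based idea (an assumption, not the final API; the class name, Channel wiring, and method shapes are illustrative only):

import kotlinx.coroutines.channels.Channel

// Sketch only: the stream callback just enqueues raw response strings, and callers
// iterate over the channel instead of handling the callback directly.
class MLCEngineSketch {
    private val responses = Channel<String>(Channel.UNLIMITED)

    // Registered with the underlying JSONFFIEngine as its stream callback.
    private fun streamCallback(responseJSONStr: String) {
        responses.trySend(responseJSONStr)
    }

    // Callers iterate over the returned channel, e.g. `for (chunk in engine.chatCompletion(req))`.
    fun chatCompletion(requestJSONStr: String): Channel<String> {
        // jsonFFIEngine.chatCompletion(requestJSONStr)  // actual FFI call omitted in this sketch
        return responses
    }
}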

jsonFFIEngine.unload()
}

fun chatCompletion(requestJSONStr: String) {
Contributor:

Rename this to rawChatCompletion, given that we will change it in the future.

}
}

// private fun streamCallback(text: String) {
Contributor:

streamCallback

@@ -146,8 +146,9 @@ class JSONFFIEngineImpl : public JSONFFIEngine, public ModuleNode {
TVM_MODULE_VTABLE_ENTRY("exit_background_loop", &JSONFFIEngineImpl::ExitBackgroundLoop);
TVM_MODULE_VTABLE_END();

void InitBackgroundEngine(Device device, Optional<PackedFunc> request_stream_callback,
Optional<EventTraceRecorder> trace_recorder) {
void InitBackgroundEngine(int device_type, int device_id,
Contributor:

Please send the JSON FFI changes as a separate PR

GlobalScope.launch {
for (it in response) {
responseText.value += it.choices[0].delta.content?.get(0)?.get("text")
}
Contributor:

Please check the latest version after #2331; there we changed the stream back to directly return the string content instead of the parts.
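For illustration, a hypothetical drop-in variant of the loop quoted above under that change; the asText() accessor is an assumption about the post-#2331 API, and the surrounding declarations (response, responseText) are as in the original snippet:

GlobalScope.launch {
    for (it in response) {
        // Assumes the post-#2331 delta content can be read as plain text; the accessor name is an assumption.
        responseText.value += it.choices[0].delta.content?.asText() ?: ""
    }
}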

val type: String,
val schema: String? = null
)
}
Contributor:

Add a newline at the end of the file.

messages=listOf(
ChatCompletionMessage(
role="user",
content= listOf(
Contributor:

Add an alternative overloaded constructor to ChatCompletionMessage so that it can take a plain string as content.
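A minimal sketch of such an overload, assuming the plain string is wrapped into the existing parts representation (the "type"/"text" keys follow the usual OpenAI parts layout and are an assumption here):

import kotlinx.serialization.Serializable

// Sketch only: a secondary constructor that wraps a plain string into the
// parts representation already used by the `content` field.
@Serializable
data class ChatCompletionMessage(
    val role: String,
    var content: List<Map<String, String>>? = null
) {
    constructor(role: String, content: String) : this(
        role = role,
        content = listOf(mapOf("type" to "text", "text" to content))
    )
}

Callers could then write ChatCompletionMessage(role = "user", content = "Hello") instead of building the parts list by hand.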

@Serializable
data class ChatCompletionMessage(
val role: String,
var content: List<Map<String, String>>? = null,
Contributor:

Introduce a ChatCompletionMessageContent class with a custom serialize method and two constructors that construct from Text or Parts; for reference, see the Swift version: https://github.com/mlc-ai/mlc-llm/blob/main/ios/MLCSwift/Sources/Swift/OpenAIProtocol.swift#L86

Contributor:

One way to implement such a union class could be:

enum class ContentKind { Text, Parts }

data class ChatCompletionMessageContent(
    var kind: ContentKind,
    var data: Any
)

We can use the enum to denote the kind, then use helper functions such as toText and isText to check and convert the kind.
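A slightly more concrete Kotlin sketch of the two-constructor variant suggested in the previous comment (field and helper names are assumptions, and the custom kotlinx.serialization serializer is omitted):

// Sketch only: names are assumptions, loosely mirroring the Swift
// ChatCompletionMessageContent linked above; custom serialization is omitted.
class ChatCompletionMessageContent {
    val text: String?
    val parts: List<Map<String, String>>?

    constructor(text: String) {
        this.text = text
        this.parts = null
    }

    constructor(parts: List<Map<String, String>>) {
        this.text = null
        this.parts = parts
    }

    fun isText(): Boolean = text != null

    // Flatten either representation into plain text.
    fun asText(): String =
        text ?: parts.orEmpty().joinToString(separator = "") { it["text"] ?: "" }
}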

Kartik14 and others added 2 commits May 17, 2024 21:43
add json ffi android example

fix lint

Refactor MLCEngineExample and MLCEngine.kt

Use ChatCompletionMessageContent class

ChatCompletionMessageContent: text and parts
…Decode in Android as List<ChatCompletionStreamResponse>
@tqchen merged commit 96fc289 into mlc-ai:main on May 18, 2024
1 of 2 checks passed