[JSON FFI] Example Android Application using JSON FFI Engine #2322
@@ -0,0 +1 @@
/build
Let us make mlcengineexample independent of MLCChat, as in the iOS case. We can use a separate packaging list via mlc-package-config.json.
import android.util.Log
import kotlin.concurrent.thread

class MLCEngine (private val streamCallback: (String) -> Unit) {
Move MLCEngine to be part of the mlc4j library (implement in Java?). JSONFFIEngine should not be exposed to users and should be internal only. Users should interact with MLCEngine.
(This can be a TODO.) Ideally, we should not use the stream callback directly, but instead put the requests into queues in the stream callback, then allow chatCompletion to iterate over these queues.
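The queue idea in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeStreamingBackend` and `QueuedCompletion` are hypothetical names, the null end-of-stream marker is an assumption of this sketch, and nothing here reflects the actual JSONFFIEngine API.

```kotlin
import java.util.concurrent.LinkedBlockingQueue
import kotlin.concurrent.thread

// Hypothetical stand-in for the engine: it only pushes text chunks into a
// callback from a background thread, similar in spirit to a stream callback.
class FakeStreamingBackend(private val streamCallback: (String?) -> Unit) {
    fun emit(chunks: List<String>) = thread {
        chunks.forEach { streamCallback(it) }
        streamCallback(null) // assumption: null marks end of stream in this sketch
    }
}

// Queue-backed wrapper: the callback only enqueues, and the caller iterates.
class QueuedCompletion {
    private object End
    private val queue = LinkedBlockingQueue<Any>()

    // Hand this to the backend as its stream callback.
    val callback: (String?) -> Unit = { chunk -> queue.put(chunk ?: End) }

    // Lazily yields chunks, blocking on the queue until the end marker arrives.
    fun chunks(): Sequence<String> = sequence {
        while (true) {
            val item = queue.take()
            if (item === End) break
            yield(item as String)
        }
    }
}

// Usage: the consumer pulls from the sequence instead of reacting to callbacks.
fun collectDemo(): String {
    val completion = QueuedCompletion()
    FakeStreamingBackend(completion.callback).emit(listOf("Hel", "lo"))
    return completion.chunks().joinToString("")
}
```

With this shape, the caller controls the pace of consumption, and the callback stays an internal detail of the engine wrapper.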
jsonFFIEngine.unload()
}

fun chatCompletion(requestJSONStr: String) {
Rename this to rawChatCompletion, given that we will change this in the future.
}
}

// private fun streamCallback(text: String) {
streamCallback
cpp/json_ffi/json_ffi_engine.cc
Outdated
@@ -146,8 +146,9 @@ class JSONFFIEngineImpl : public JSONFFIEngine, public ModuleNode {
TVM_MODULE_VTABLE_ENTRY("exit_background_loop", &JSONFFIEngineImpl::ExitBackgroundLoop);
TVM_MODULE_VTABLE_END();

void InitBackgroundEngine(Device device, Optional<PackedFunc> request_stream_callback,
Optional<EventTraceRecorder> trace_recorder) {
void InitBackgroundEngine(int device_type, int device_id,
Please send the JSON FFI changes as a separate PR
GlobalScope.launch {
    for (it in response) {
        responseText.value += it.choices[0].delta.content?.get(0)?.get("text")
    }
Please check the latest version after #2331. In that change, we switched the stream back to directly returning the string content instead of the parts.
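If the stream delta carries a plain string, the collection loop could look roughly like the sketch below. The `Delta` and `Choice` classes are simplified, hypothetical stand-ins, not the PR's real protocol classes, and the nullable `content` field is an assumption. One detail worth noting: in Kotlin, `+=` on a `String` with a null right-hand side appends the literal text "null", so the sketch skips null content explicitly.

```kotlin
// Simplified, hypothetical response types for illustration only.
data class Delta(val content: String? = null)
data class Choice(val delta: Delta)
data class ChatCompletionStreamResponse(val choices: List<Choice>)

// Concatenates the string content of the first choice of each streamed
// response, skipping responses with no choices or with null content.
fun collectText(responses: Iterable<ChatCompletionStreamResponse>): String {
    val sb = StringBuilder()
    for (response in responses) {
        response.choices.firstOrNull()?.delta?.content?.let { sb.append(it) }
    }
    return sb.toString()
}
```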
    val type: String,
    val schema: String? = null
)
}
Add a newline at the end of the file.
messages=listOf(
    ChatCompletionMessage(
        role="user",
        content= listOf(
Add an alternative overloaded constructor to ChatCompletionMessage so that it can take a string as content.
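A minimal sketch of such an overload, assuming the list-of-maps content shape from the hunk under review. The `@Serializable` annotation is omitted so the snippet stays stdlib-only, and the `"type" to "text"` entry is an assumption about the part format, not something shown in this PR.

```kotlin
data class ChatCompletionMessage(
    val role: String,
    var content: List<Map<String, String>>? = null
) {
    // Convenience overload: wraps a plain string as a single text part.
    // The "type" key is an assumed convention for this sketch.
    constructor(role: String, content: String) :
        this(role, listOf(mapOf("type" to "text", "text" to content)))
}

// Usage: callers can pass a string without building the parts list by hand.
val userMessage = ChatCompletionMessage(role = "user", content = "What is the capital of France?")
```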
@Serializable
data class ChatCompletionMessage(
    val role: String,
    var content: List<Map<String, String>>? = null,
Introduce a ChatCompletionMessageContent class with a custom serialize method and two constructors that construct it from Text or Parts. For reference, see the Swift version: https://github.com/mlc-ai/mlc-llm/blob/main/ios/MLCSwift/Sources/Swift/OpenAIProtocol.swift#L86
One way to implement such a union class is:
data class ChatCompletionMessageContent {
    var kind: enum;
    var data: Any;
};
We can use an enum to denote the kind, then use helper functions such as toText and IsText to check the kind.
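In valid Kotlin, the union class described above might look like the following sketch. `ContentKind` and the helper names `isText`, `isParts`, `asText`, and `asParts` are hypothetical choices loosely based on the helpers the comment mentions, and the custom serializer is deliberately left out.

```kotlin
// Hypothetical tag for which variant the content holds.
enum class ContentKind { TEXT, PARTS }

class ChatCompletionMessageContent private constructor(
    val kind: ContentKind,
    private val data: Any
) {
    // Construct from plain text.
    constructor(text: String) : this(ContentKind.TEXT, text)

    // Construct from a list of parts.
    constructor(parts: List<Map<String, String>>) : this(ContentKind.PARTS, parts)

    fun isText(): Boolean = kind == ContentKind.TEXT
    fun isParts(): Boolean = kind == ContentKind.PARTS

    // Returns the text, or throws IllegalStateException if this holds parts.
    fun asText(): String {
        check(isText()) { "content holds parts, not text" }
        return data as String
    }

    // Returns the parts, or throws IllegalStateException if this holds text.
    @Suppress("UNCHECKED_CAST")
    fun asParts(): List<Map<String, String>> {
        check(isParts()) { "content holds text, not parts" }
        return data as List<Map<String, String>>
    }
}
```

Keeping the primary constructor private means a value can only be built through one of the two typed constructors, so `kind` and `data` cannot disagree. A sealed class with two subclasses would be another common way to express the same union in Kotlin.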
add json ffif android example
fix lint
Refactor MLCEngineExample and MLCEngine.kt
Use ChatCompletionMessageContent class
ChatCompletionMessageContent: text and parts
…Decode in Android as List<ChatCompletionStreamResponse>
a0b2aa7 to 8e1102f
This PR adds a small example APK for loading and running phi-2 using the JSON FFI backend.
Steps to run the APK:
1. Copy the downloaded phi-2 model to the module's files directory using
cp -r /storage/emulated/0/Android/data/ai.mlc.mlcchat/files/ /storage/emulated/0/Android/data/ai.mlc.mlcengineexample/files/
2. Change the model lib path to the custom system-lib used while compiling the model library, in the file /mlc-llm/android/MLCChat/mlcengineexample/src/main/java/ai/mlc/mlcengineexample/MainActivity.kt
3. Build and run the mlcengineexample module.

This was worked on collaboratively with @anibohara2000!
Additional Comments:
- Removed the traceRecorder argument from the init_background_engine function call.
- Changed the device argument into device_type and device_id.