Legend:

7/17/25

Here’s a table summarizing all relevant JSON parameters for the ModelFactory refactor, with columns showing if each item is implemented in the OLD and NEW versions:

| Parameter (JSON Path) | Description | OLD ModelFactory | NEW ModelFactory (Planned) |
| --- | --- | --- | --- |
| `architecture.model_type` | Model type selector | YES | YES |
| `architecture.input_dim` | Input vector dimension | YES | YES |
| `architecture.student_dim` | Student (compressed) dimension | YES | YES |
| `architecture.teacher_dim` | Teacher (output) dimension | YES | YES |
| `architecture.pure_ln_config.vector_native` | Operate in latent space only | PARTIAL | YES |
| `architecture.pure_ln_config.token_free` | No tokenization | PARTIAL | YES |
| `architecture.pure_ln_config.nuclear_compression` | Enable nuclear compression | NO | YES |
| `architecture.pure_ln_config.compression_stages` | List of dynamic layers (type/in/out/dim) | NO | YES |
| `architecture.semantic_preservation.` | Loss weights, bottleneck, diversity enforcement | NO | YES (if needed by model) |
| `architecture.layer_specification.input_stage` | List of input layers (type/dim/source/params) | NO | YES |
| `architecture.layer_specification.compression_core` | List of core layers (type/name/in/out/activation/params) | NO | YES |
| `architecture.layer_specification.output_stage` | List of output layers (type/name/in/out/dim/params) | NO | YES |
| `training.hyperparameters.loss_function` | Loss function name (may affect model outputs) | PARTIAL | YES (if needed by model) |
| `training.hyperparameters.loss_components.` | Loss component weights/methods | PARTIAL | YES (if needed by model) |
| `training.hyperparameters.learning_rate` | Learning rate (optimizer, not model) | NO | NO |
| `training.hyperparameters.batch_size` | Training batch size (pipeline, not model) | NO | NO |
| `training.hyperparameters.optimizer` | Optimizer type (pipeline, not model) | NO | NO |
| `training.hyperparameters.optimizer_params.` | Optimizer params (may affect model init, e.g. weight decay) | NO | PARTIAL |
| `training.hyperparameters.gradient_clipping` | Gradient clipping (pipeline, not model) | NO | NO |
| `training.hyperparameters.warmup_steps` | Scheduler/pipeline | NO | NO |
| `training.scheduler.` | Scheduler config (pipeline, not model) | NO | NO |
| `training.early_stopping.` | Early stopping config (pipeline, not model) | NO | NO |
| `training.data.vector_preprocessing.` | Vector preprocessing (may affect normalization) | PARTIAL | YES (if needed by model) |

Legend:

YES: Fully implemented or will be fully supported

PARTIAL: Only partially supported or handled in a limited way

NO: Not supported or not relevant for ModelFactory (handled by pipeline/trainer)
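
To make the model-versus-pipeline split in the table concrete, here is a minimal sketch, assuming the JSON config has been loaded into a nested dict. The prefix list, the helper names `flatten` and `split_config`, and all example values are hypothetical and only mirror the rows above; the real ModelFactory may partition keys differently.

```python
from typing import Any, Dict, Iterator, Tuple

# Assumption: dotted-path prefixes the NEW ModelFactory would read
# (rows marked YES or PARTIAL in the table). Everything else is pipeline-only.
MODEL_PREFIXES = (
    "architecture.",
    "training.hyperparameters.loss_function",
    "training.hyperparameters.loss_components",
    "training.hyperparameters.optimizer_params",
    "training.data.vector_preprocessing",
)


def flatten(cfg: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) for every leaf of a nested dict."""
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def split_config(cfg: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate keys relevant to model construction from pipeline-only keys."""
    model: Dict[str, Any] = {}
    pipeline: Dict[str, Any] = {}
    for path, value in flatten(cfg):
        # str.startswith accepts a tuple and matches any of its entries.
        target = model if path.startswith(MODEL_PREFIXES) else pipeline
        target[path] = value
    return model, pipeline


# Placeholder values for illustration only; not taken from a real config.
example = {
    "architecture": {"model_type": "pure_ln_vector_processor", "input_dim": 8},
    "training": {
        "hyperparameters": {
            "learning_rate": 0.001,
            "optimizer": "adamw",
            "optimizer_params": {"weight_decay": 0.01},
        },
        "scheduler": {"type": "cosine"},
    },
}
model_keys, pipeline_keys = split_config(example)
```

Keeping the split in one place like this would let the factory log exactly which keys it consumed and ignore the rest.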

Key Improvements in the NEW ModelFactory:

Dynamic, config-driven layer construction (`compression_stages`, `layer_specification`)

Support for nuclear compression and diversity logic

Awareness of loss/semantic preservation settings for exposing intermediate outputs

Logging of full architecture and all relevant config at model creation
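
The first and last improvements above (config-driven layer construction and architecture logging) can be sketched as follows. This is a minimal illustration, not the planned implementation: it assumes each `compression_stages` entry is a dict with `type`, `in`, and `out` keys (the table lists type/in/out/dim, but the exact key names are an assumption), and `LayerSpec`, `plan_layers`, and `describe` are hypothetical names. It validates the dimension chain and produces an ASCII summary without depending on any deep learning framework.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class LayerSpec:
    """Framework-agnostic description of one layer taken from the config."""
    kind: str
    in_dim: int
    out_dim: int


def plan_layers(stages: List[Dict[str, Any]], input_dim: int) -> List[LayerSpec]:
    """Turn config entries into LayerSpecs, checking that dimensions chain."""
    specs: List[LayerSpec] = []
    expected = input_dim
    for index, stage in enumerate(stages):
        spec = LayerSpec(stage["type"], int(stage["in"]), int(stage["out"]))
        if spec.in_dim != expected:
            raise ValueError(
                f"stage {index}: expected input dim {expected}, got {spec.in_dim}"
            )
        specs.append(spec)
        expected = spec.out_dim
    return specs


def describe(specs: List[LayerSpec]) -> str:
    """Render the planned architecture as plain ASCII, one layer per line."""
    return "\n".join(
        f"[{i}] {s.kind:<12} {s.in_dim:>5} -> {s.out_dim}"
        for i, s in enumerate(specs)
    )


# Placeholder stages for illustration only.
stages = [
    {"type": "linear", "in": 8, "out": 4},
    {"type": "linear", "in": 4, "out": 2},
]
specs = plan_layers(stages, input_dim=8)
summary = describe(specs)
```

A real factory would map each `LayerSpec` to a concrete layer object; validating the chain first means a bad config fails at model creation rather than on the first forward pass.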

Let me know if you want a more granular breakdown, or if you’d like to see a sample config-to-model mapping for a specific architecture!

Table: Current Model Type Handling

| Model Type (config) | ModelFactory Path | Actual Model Class | Config-driven? | Supports Dynamic Layers? | ASCII Logging? | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| `vector_only` | `_create_vector_model` | `StudentModel` | No | No | No | Static, legacy |
| `pure_vector` | mapped to `vector_only` | `StudentModel` | No | No | No | Alias for `vector_only` |
| `pure_ln_vector_processor` | `_create_pure_ln_processor` | `PureLNVectorProcessor` | Yes | Yes | Yes | New, config-driven, preferred |
| `latent_transformer` | `_create_transformer_model` | `LatentTransformerModel` | ? | ? | ? | Not LN-native |
| `vector_native_ln_testing` | (testing config, not model) | (should be PureLN) | Yes | Yes | Yes | Should use same code as above |

Summary

All LN-native model types should use the same model class: `PureLNVectorProcessor`.

They should all support config-driven, dynamic layer construction and verbose logging.

Testing should always use the same code path as training for LN-native validation.

Action Plan:

Unify all model type aliases (`vector_only`, `pure_vector`, `pure_ln_vector_processor`, `vector_native_ln_testing`) to use the `PureLNVectorProcessor` code path in `ModelFactory`.

Remove or deprecate the old `StudentModel` path.

Ensure all relevant logging and dynamic layer support is available regardless of which alias is used in the config.
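
One way the alias unification in this plan could look, as a hedged sketch: the names `CANONICAL`, `ALIASES`, `LEGACY`, and `resolve_model_type` are hypothetical and not part of the existing ModelFactory. It assumes that `latent_transformer`, which is not LN-native, keeps its own path and passes through unchanged, and that the two aliases that used to reach `StudentModel` should emit a deprecation warning.

```python
import warnings

CANONICAL = "pure_ln_vector_processor"

# Every LN-native alias resolves to the single PureLNVectorProcessor path.
ALIASES = {
    "vector_only": CANONICAL,
    "pure_vector": CANONICAL,
    "pure_ln_vector_processor": CANONICAL,
    "vector_native_ln_testing": CANONICAL,
}

# Aliases that previously routed to the static StudentModel path.
LEGACY = {"vector_only", "pure_vector"}


def resolve_model_type(model_type: str) -> str:
    """Map a config model_type to the code path the factory should take."""
    if model_type in LEGACY:
        warnings.warn(
            f"model_type {model_type!r} now resolves to {CANONICAL!r}; "
            "the StudentModel path is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
    # Types outside the alias table (e.g. latent_transformer) are returned as-is.
    return ALIASES.get(model_type, model_type)
```

Resolving the alias once, before any dispatch, means logging and dynamic layer support attach to the canonical path and cannot differ between aliases.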
