7/17/25
Here’s a table summarizing all relevant JSON parameters for the ModelFactory refactor, with columns showing whether each item is implemented in the OLD and NEW versions:
| Parameter (JSON Path) | Description | OLD ModelFactory | NEW ModelFactory (Planned) |
| --- | --- | --- | --- |
| architecture.model_type | Model type selector | YES | YES |
| architecture.input_dim | Input vector dimension | YES | YES |
| architecture.student_dim | Student (compressed) dimension | YES | YES |
| architecture.teacher_dim | Teacher (output) dimension | YES | YES |
| architecture.pure_ln_config.vector_native | Operate in latent space only | PARTIAL | YES |
| architecture.pure_ln_config.token_free | No tokenization | PARTIAL | YES |
| architecture.pure_ln_config.nuclear_compression | Enable nuclear compression | NO | YES |
| architecture.pure_ln_config.compression_stages | List of dynamic layers (type/in/out/dim) | NO | YES |
| architecture.semantic_preservation. | Loss weights, bottleneck, diversity enforcement | NO | YES (if needed by model) |
| architecture.layer_specification.input_stage | List of input layers (type/dim/source/params) | NO | YES |
| architecture.layer_specification.compression_core | List of core layers (type/name/in/out/activation/params) | NO | YES |
| architecture.layer_specification.output_stage | List of output layers (type/name/in/out/dim/params) | NO | YES |
| training.hyperparameters.loss_function | Loss function name (may affect model outputs) | PARTIAL | YES (if needed by model) |
| training.hyperparameters.loss_components. | Loss component weights/methods | PARTIAL | YES (if needed by model) |
| training.hyperparameters.learning_rate | Learning rate (optimizer, not model) | NO | NO |
| training.hyperparameters.batch_size | Training batch size (pipeline, not model) | NO | NO |
| training.hyperparameters.optimizer | Optimizer type (pipeline, not model) | NO | NO |
| training.hyperparameters.optimizer_params. | Optimizer params (may affect model init, e.g. weight decay) | NO | PARTIAL |
| training.hyperparameters.gradient_clipping | Gradient clipping (pipeline, not model) | NO | NO |
| training.hyperparameters.warmup_steps | Scheduler/pipeline | NO | NO |
| training.scheduler. | Scheduler config (pipeline, not model) | NO | NO |
| training.early_stopping. | Early stopping config (pipeline, not model) | NO | NO |
| training.data.vector_preprocessing. | Vector preprocessing (may affect normalization) | PARTIAL | YES (if needed by model) |
Legend:
YES: Fully implemented or will be fully supported
PARTIAL: Only partially supported or handled in a limited way
NO: Not supported or not relevant for ModelFactory (handled by pipeline/trainer)
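The split in the table between model-relevant and pipeline-only parameters can be sketched as a small filter. This is only a minimal sketch, not the actual ModelFactory code: the names `MODEL_PATHS`, `get_path`, and `extract_model_config` are hypothetical, and every value in the sample config is invented for illustration. Only the JSON paths themselves come from the table.

```python
from typing import Any

# Dotted JSON paths marked YES or PARTIAL for the NEW ModelFactory in the table.
# The tuple name is hypothetical; the whole "architecture" subtree is kept.
MODEL_PATHS = (
    "architecture",
    "training.hyperparameters.loss_function",
    "training.hyperparameters.loss_components",
    "training.hyperparameters.optimizer_params",
    "training.data.vector_preprocessing",
)


def get_path(config: dict[str, Any], dotted: str) -> Any:
    """Return the value at a dotted path, or None if any key is missing."""
    node: Any = config
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def extract_model_config(config: dict[str, Any]) -> dict[str, Any]:
    """Keep only the paths the factory reads; drop pipeline-only settings."""
    picked: dict[str, Any] = {}
    for path in MODEL_PATHS:
        value = get_path(config, path)
        if value is not None:
            picked[path] = value
    return picked


# Illustrative config; all values are invented for the example.
sample = {
    "architecture": {"model_type": "pure_ln_vector_processor", "input_dim": 384},
    "training": {
        "hyperparameters": {
            "loss_function": "cosine",
            "learning_rate": 1e-4,  # pipeline-only, not read by the factory
            "batch_size": 32,       # pipeline-only, not read by the factory
        },
        "scheduler": {"type": "cosine"},  # pipeline-only
    },
}

model_config = extract_model_config(sample)
print(sorted(model_config))
```

Keeping the list of model-relevant paths in one place makes it easy to audit against the table when a row changes from NO to YES.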
Key Improvements in the NEW ModelFactory:
Dynamic, config-driven layer construction (`compression_stages`, `layer_specification`)
Support for nuclear compression and diversity logic
Awareness of loss/semantic preservation settings for exposing intermediate outputs
Logging of full architecture and all relevant config at model creation
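To make the first and last points concrete, here is a rough sketch of how a `compression_stages` list might be validated and then logged as plain ASCII at model creation. It is an assumption-heavy illustration: the `StageSpec` class, the `parse_stages` and `describe` helpers, and the exact key names (`type`, `in`, `out`) are guesses based on the "type/in/out/dim" description in the table, and no real layers are built here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class StageSpec:
    """One entry of compression_stages; the field names are assumed."""
    type: str
    in_dim: int
    out_dim: int


def parse_stages(raw: list[dict[str, Any]], input_dim: int) -> list[StageSpec]:
    """Check that each stage's input size matches the previous stage's output."""
    stages: list[StageSpec] = []
    expected = input_dim
    for i, entry in enumerate(raw):
        spec = StageSpec(entry["type"], int(entry["in"]), int(entry["out"]))
        if spec.in_dim != expected:
            raise ValueError(
                f"stage {i} ({spec.type}) expects {spec.in_dim} inputs, "
                f"but the previous stage produces {expected}"
            )
        stages.append(spec)
        expected = spec.out_dim
    return stages


def describe(stages: list[StageSpec]) -> str:
    """Render a plain-ASCII summary, one line per stage, for logging."""
    lines = [
        f"[{i}] {s.type:<10} {s.in_dim:>5} -> {s.out_dim:<5}"
        for i, s in enumerate(stages)
    ]
    return "\n".join(lines)


# Invented example values; a real config would come from the JSON file.
raw_stages = [
    {"type": "linear", "in": 384, "out": 128},
    {"type": "linear", "in": 128, "out": 32},
]
stages = parse_stages(raw_stages, input_dim=384)
summary = describe(stages)
print(summary)
```

Validating the dimension chain before any layer is instantiated means a bad config fails with a readable message instead of a shape error deep inside training.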
Let me know if you want a more granular breakdown, or if you’d like to see a sample config-to-model mapping for a specific architecture!
Table: Current Model Type Handling
| Model Type (config) | ModelFactory Path | Actual Model Class | Config-driven? | Supports Dynamic Layers? | ASCII Logging? | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| `vector_only` | _create_vector_model | StudentModel | No | No | No | Static, legacy |
| `pure_vector` | mapped to `vector_only` | StudentModel | No | No | No | Alias for `vector_only` |
| `pure_ln_vector_processor` | _create_pure_ln_processor | PureLNVectorProcessor | Yes | Yes | Yes | New, config-driven, preferred |
| `latent_transformer` | _create_transformer_model | `LatentTransformerModel` | ? | ? | ? | Not LN-native |
| `vector_native_ln_testing` | (testing config, not model) | (should be PureLN) | Yes | Yes | Yes | Should use same code as above |
Summary
All LN-native model types should use the same model class: `PureLNVectorProcessor`.
They should all support config-driven, dynamic layer construction and verbose logging.
Testing should always use the same code path as training for LN-native validation.
Action Plan:
Unify all model type aliases (`vector_only`, `pure_vector`, `pure_ln_vector_processor`, `vector_native_ln_testing`) to use the `PureLNVectorProcessor` code path in `ModelFactory`.
Remove or deprecate the old `StudentModel` path.
Ensure all relevant logging and dynamic layer support is available regardless of which alias is used in the config.
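One possible shape for the alias unification is a single lookup that runs before the factory dispatches. This is a sketch under assumptions rather than the real ModelFactory API: `CANONICAL`, `ALIASES`, `LEGACY`, and `resolve_model_type` are hypothetical names, and passing unknown types such as `latent_transformer` through unchanged is a design choice made here for illustration.

```python
import warnings

# Canonical type that every LN-native alias resolves to (per the action plan).
CANONICAL = "pure_ln_vector_processor"

# Aliases named in the action plan; the dict name is hypothetical.
ALIASES = {
    "vector_only": CANONICAL,
    "pure_vector": CANONICAL,
    "pure_ln_vector_processor": CANONICAL,
    "vector_native_ln_testing": CANONICAL,
}

# Types that previously went through the legacy StudentModel path.
LEGACY = {"vector_only", "pure_vector"}


def resolve_model_type(model_type: str) -> str:
    """Map any LN-native alias to the canonical type; pass others through."""
    if model_type in LEGACY:
        warnings.warn(
            f"model_type {model_type!r} used the legacy StudentModel path; "
            f"it now resolves to {CANONICAL!r}",
            DeprecationWarning,
            stacklevel=2,
        )
    # Types outside the alias map (e.g. latent_transformer) keep their own path.
    return ALIASES.get(model_type, model_type)


for name in ("pure_vector", "vector_native_ln_testing", "latent_transformer"):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        print(name, "->", resolve_model_type(name))
```

Because testing and training configs resolve through the same function, the testing alias cannot drift onto a different code path from the one used in training.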