Changes between Version 15 and Version 16 of DsxDocumentation
- Timestamp:
- Jan 30, 2008, 10:32:42 AM (17 years ago)
DsxDocumentation
v15 v16
5 5 == A) Goals and general principles ==
6 6
7 DSX stands for ''Design Space eXplorer''. It helps the system designer to map a multi-threaded software application
7 DSX stands for ''Design Space Explorer''. It helps the system designer to map a multi-threaded software application
8 8 on a multi-processor hardware architecture (MP-SoC) modeled with the SoCLib components.
9 9
… …
14 14
15 15 A specific goal of DSX is to allow the system designer to control not only the placement of the
16 tasks on the processors, but the placement of the software objects (execution stacks,
16 tasks on the processors, but also the placement of the software objects (execution stacks,
17 17 communication buffers, synchronization locks, etc.) on the memory banks. In shared memory multi-processors
18 18 architectures with several physically distributed memory banks, such control is mandatory to optimize
19 19 both the performances and the power consumption.
20 20
21 The two targeted application domains are the telecommunication applications (where the tasks are handling packets or packet descriptors), and multi -media applications (where the tasks are handling audio or video streams).
22
23 The general principles of the DSX toolare the following:
21 The two targeted application domains are the telecommunication applications (where the tasks are handling packets or packet descriptors), and multimedia applications (where the tasks are handling audio or video streams).
22
23 The general principles of DSX are the following:
24 24 * The coarse grain parallelism of the software application must be statically defined as a '''Task & Communications Graph (TCG)'''. The number of tasks, and the communication channels between tasks should not change during execution.
25 25 * The software tasks are supposed to be written in C or C++, but - for portability reasons - the tasks must use an abstract '''System Resource Layer (SRL)''' API to access the communication and synchronizations resources.
26 * Each task in the TCG can be implemented as a '''software task''' (software running on an embedded processor), or can be implemented as an '''hardware task''' ,(running as a dedicated hardware coprocessor).
27 * DSX allows the programmer to use unprotected shared memory spaces, but the prefered inter-tasks communication mechanism use the '''MWMR middleware'''. The MWMR (Multi-Writer, Multi-Reader)communication channels, are implemented as software FIFOs and can be shared by ''software tasks'',and by ''hardware tasks''.
26 * Each task in the TCG can be implemented as a '''software task''' (software running on an embedded processor), or can be implemented as an '''hardware task''' (running as a dedicated hardware coprocessor).
27 * DSX allows the programmer to use unprotected shared memory regions, but the prefered inter-tasks communication mechanism use the '''MWMR middleware'''. The MWMR (Multi-Writer, Multi-Reader) communication channels, are implemented as software FIFOs and can be shared by ''software tasks'' and by ''hardware tasks''.
28 28 * DSX provides classical synchronization mechanisms such as '''barriers''' and '''locks''', but inter-task synchronisation is mainly done through the data availability in the MWMR channels.
29 * The target hardware architecture is a '''shared memory multi-processor system on chip''' (MP-SoC) using the SoCLib library of IP cores. But - in order to validate the multi-threaded software application -DSX is able to generate an executable binary code for a standard POSIX workstation.
29 * The target hardware architecture is a '''shared memory multi-processor system on chip''' (MP-SoC) using the SoCLib library of IP cores. In order to validate the multi-threaded software application, DSX is able to generate an executable binary code for a standard POSIX workstation.
30 30 * DSX supports the '''POSIX''' compliant [https://www-asim.lip6.fr/trac/mutekh Mutek] OS kernel for embedded MPSoCs
31 * Finally,DSX defines the '''DSX/L''' language, based on PYTHON, that allows the system designer to describe in a single file the Task & Communication Graph (TCG), the MP-SoC hardware architecture, and various mapping of the TCG on the MP-Soc architecture.
31 * DSX defines the '''DSX/L''' language, based on PYTHON, that allows the system designer to describe in a single file the Task & Communication Graph (TCG), the MP-SoC hardware architecture, and various mapping of the TCG on the MP-Soc architecture.
32 32
33 33 The DSX/L script execution generates the binary code executable on the workstation, the
34 SystemC model of the ''top cell''correspondint to the MP-SoC architecture, and the binary
34 simulator correspondint to the MP-SoC architecture, and the binary
35 35 code that will be uploaded in the MP-Soc embedded memory.
36 36
… …
55 55 * flush a MWMR channel
56 56 * '''srl_mwmr_flush( )'''
57
57 58 * Synchronization barrier
58 59 * '''srl_barrier_wait( )'''
60
59 61 * taking and releasing a lock
60 * '''srl_lo ock_lock( )'''
62 * '''srl_lock_lock( )'''
61 63 * '''srl_lock_unlock( )'''
64
62 65 * accessing a shared memory space (address and size)
63 66 * '''srl_memspace_addr( )'''
64 67 * '''srl_memspace_size( )'''
65 68
66 Three platforms are presently supported :
69 Three platforms are currently supported :
67 70 * Any Linux (or Unix) workstation supporting the POSIX threads,
68 * MP-SoC architecture using the M UTEK/D operation system,
69 * MP-SoC architecture using the M UTEK/S operating system,
70
71 M UTEK/D is an embedded, POSIX compliant, distributed, operating system for MP-SoCs,
72 while M UTEK/S is an optimized version: the performances are improved, and the memory
71 * MP-SoC architecture using the Mutek/D operation system,
72 * MP-SoC architecture using the Mutek/S operating system,
73
74 Mutek/D is an embedded, POSIX compliant,
distributed, operating system for MP-SoCs,
75 while Mutek/S is an optimized version: the performances are improved, and the memory
73 76 footprint is reduced, at the cost of losing the POSIX compatibility.
74 77
… …
82 85 [[Image(MjpegCourse:mjpeg.png)]]
83 86
84 The two TG & RAMDAC tasks will be implemented as hardware coprocessors : the TG component implements a wire -less receiver for the MJPEG stream, and the RAMDAC component is a graphic display controller.
87 The two TG & RAMDAC tasks will be implemented as hardware coprocessors : the TG component implements a wireless receiver for the MJPEG stream, and the RAMDAC component is a graphic display controller.
85 88 The 5 other tasks can be implemented as ''software tasks'' or as ''hardware tasks''. In this particular example,
86 89 all MWMR communication channels have one single producer, and one
87 single consumer, which is frequent for stream oriented multi -media applications.
90 single consumer, which is frequent for stream oriented multimedia applications.
88 91
89 92 === C1) Task Model definition ===
90 93
91 As a software application can instanciate several instances of the same task, we must distinguish the task, and the task model. A task model defines the code associated to the task, and the task interface (corresponding to the system resources used by the task : MWMR communications channels, synchronization barriers, locks, and memspaces).
94 As a software application can have several instances of the same task, we must distinguish the task, and the task model. A task model defines the code associated to the task, and the task interface corresponding to the system resources used by the task (MWMR communications channels, synchronization barriers, locks, memspaces, ...).
92 95 {{{
93 96 task_model = TaskModel( 'model_name',
… …
97 100 barriers = [ 'barrier_name', ... ] ,
98 101 memspaces = [ 'memspace_name', ... ] ,
99 signals = [ 'signal_name', ...
] ,
100 102 impls = [ SwTask( 'func', stack_size = 1024 , sources = [ 'func.c' ] )
101 103 }}}
… …
138 140 1. ''lock'' : lock protecting exclusive access
139 141
140 === C4) Memspace definition
142 === C4) Memspace definition ===
141 143
142 144 Direct communication through shared memory buffers is supported by DSX, but there is no protection mechanism, and the synchronization is the programmer's responsibility.
… …
158 160 my_lock = Lock( 'lock_name' )
159 161 }}}
160 In the mapping section of the DSX/L program, the lock can be explicitely placed in the memory space.
162 In the mapping section of the DSX/L program, 1 software object must be placed :
163 1. ''lock'' : Where to place the lock
161 164
162 165 === C6) Task instanciation ===
… …
175 178 1. ''run'' : processor running the task
176 179
177 === C 8) TCG definition ===
180 === C7) TCG definition ===
178 181
179 182 The Task and Communication Graph must be defined :
… …
196 199 === D1) SoCLib components ===
197 200
198 In the present version of DSX, each hardware component must be described by a PYTHON
201 In the current version of DSX, each hardware component must be described by a Python
199 202 class that defines the component interface, and the component parameters.
200 203 The list of available components can be found in SoclibComponents.
… …
221 224 Depending on the component type, the port designation can vary:
222 225 * When the number of ports is fixed, the ports are attributes : My_Proc0.cache defines the cache port of the MIPS processor.
223 * When the number of port is not fixed (typivally for interconnect component, the ports are a ccessed through a dedicated method : the getTarget() method of the !LocalCrossbar component returns a VCI targetport.
226 * When the number of port is not fixed (typivally for interconnect component, the ports are allocated through a dedicated method : the getTarget() method of the !LocalCrossbar component allocates a VCI target port, the getInit() method allocates an VCI Initiator port.
224 227 The following example describes a simple system with two processors and one embedded memory:
225 228 {{{
… …
250 253
251 254 In any shared memory architecture, the address space is a shared resource. This resource is structured in several segments. A segment has a name, a base address, a size
252 (number of bytes), and a cacheability attribut (Boolean). A segment is a physical entity associated to a
255 (number of bytes), and a cacheability attribute (boolean). A segment is a physical entity associated to a
253 256 given VCI target. Several segments can be associated to the same VCI target, but a given segment cannot be distributed over several VCI targets.
254 257
… …
263 266
264 267 # Instanciating a VCI target hardware component
265 # and Linking the segments to this component
268 # and assigning the segments to this component
266 269 my_ram = MultiRam ( 'ram', seg_data1, seg_data2, seg_reset )
267 270 }}}
… …
278 281 === D4) Generic platforms ===
279 282
280 As DSX/L is based on P YTHON, it is possible to define generic, parametrized architectutes, that can
283 As DSX/L is based on Python, it is possible to define generic, parametrized architectutes, that can
281 284 be reused for various applications. Those reusable architectures are derived classes
282 285 from the basic '''Architecture''' class. The implementation is defined in the architecture() method.
283 286
284 As an example we define a parameterized multi-processors architecture, called MultiProc, and containing
285 a variable number of processors. The parameter(s) must be named, and the actual parameter value is defined when the architecture is instanciated. The parameter is referenced with the ''getParam()'' method, and it is possible to define a default value.
287 As an example we define a parameterized multi-processors architecture, called !MultiProc, and containing
288 a variable number of processors. The parameter(s) must be named, and the actual parameter value is defined
289 when the architecture is instanciated. The parameter is referenced with the ''getParam()'' method, and it
290 is possible to define a default value.
286 291 {{{
287 292 #################################
… …
292 297 def architecture(self):
293 298
294 # segments definition
295 self.reset = Segment( ’reset’, address = 0xbfc00000, type = Cached )
296 self.code = Segment( ’code’, type = Cached )
297 self.data = Segment( ’data’, type = Uncached )
298
299 # components instanciation and connexion
300 self.vgmn = Vgmn( ’vgmn’ )
301 self.ram = MultiRam( ’ram’, self.reset, self.code, self.data )
302 # processors and caches
303 self.cpus = []
304 for i in self.getParam( ’nbcpu’ ):
305 m = Mips( ’mips%d’%i )
306 self.cpus.append( m )
307 c = Xcache( ’cache%d’%i )
308 g:c.cache // m.cache )
309 c.vci // self.vgmn.getTarget() )
310 self.vgmn.getTarget() // self.c1
311 self.vgmn.getTarget() // self.c2
312 self.vgmn.getInit() // self.ram
313
314 # base definition
315 self.setBase( self.vgmn )
316
317 # segment table initialization
318 self.setConfig(’mapping_table’, MappingTable() )
299 # segments definition
300 self.reset = Segment( ’reset’, address = 0xbfc00000, type = Cached )
301 self.code = Segment( ’code’, type = Cached )
302 self.data = Segment( ’data’, type = Uncached )
303
304 # components instanciation and connexion
305 self.vgmn = Vgmn( ’vgmn’ )
306 self.ram = MultiRam( ’ram’, self.reset, self.code,
self.data )
307 # processors and caches
308 self.cpus = []
309 for i in self.getParam( ’nbcpu’ ):
310 m = Mips( ’mips%d’%i )
311 self.cpus.append( m )
312 c = Xcache( ’cache%d’%i )
313 g:c.cache // m.cache )
314 c.vci // self.vgmn.getTarget() )
315 self.vgmn.getTarget() // self.c1
316 self.vgmn.getTarget() // self.c2
317 self.vgmn.getInit() // self.ram
318
319 # base definition
320 self.setBase( self.vgmn )
321
322 # segment table initialization
323 self.setConfig(’mapping_table’, MappingTable() )
319 324
320 325 ####################################
… …
331 336 === E1) Mapper declaration ===
332 337
333 A Sit is possible to define various mapping for a given TCG, and a given architecture, we must define a third object : this ''mapper'' will contain all the mapping directives defined by the system designer.
338 As it is possible to define various mapping for a given TCG, and a given architecture, we must define a third object : this ''mapper'' will contain all the mapping directives defined by the system designer.
334 339 {{{
335 340 my_mapper = Mapper( my_tcg, my_architecture )
… …
339 344
340 345 The mapper has a method ''map()'' that is used to assign a software object to an hardware component.
341 An hardware component can b a processor, ora segment associated to an embedded memory bank,
346 An hardware component can be a processor, a segment associated to an embedded memory bank,
342 347 or a segment associated to an addressable peripheral.
343 348 {{{
… …
367 372 to various outputs : binary code for the software application, hardware architecture simulation model, etc.
368 373
369
374 This involves a code generator.
Several code generators exist; they may apply to different parts of your design:
375 * Software only (Tcg object)
376   * Posix() for generating native workstation code
377 * Software and hardware (Mapper object)
378   * MutekS() to use Mutek/S as supporting embedded OS
379   * MutekD() to use Mutek/D as supporting embedded OS
380   * any hardware generator (those on the next lines); this creates a platform that automatically loads the embedded software
381 * Hardware only (Hardware object)
382   * Caba() to create a CABA netlist (with SoCLib)
383   * Tlmt() to create a TLM-T netlist (with SoCLib)
384
385 Users may want a convenience Makefile in the platform root that builds all the code;
386 it can be created by passing all the generators used for code generation to TopMakefile().
387
388 Example: Let's create
389 * an application mapped on a hardware platform with CABA and TLM-T simulators
390 * a corresponding application for the workstation
391 * a top Makefile:
392
393 {{{
394
395 soft = Tcg( ... )
396 hard = Hardware( ... )
397
398 mapping = Mapper( soft, hard )
399
400 mapping.map( ... )
401
402 # Generators now:
403
404 muteks_generator = MutekS()
405 caba_generator = Caba()
406 tlmt_generator = Tlmt()
407
408 posix_generator = Posix()
409
410 # MutekS and the simulators (Caba / Tlmt) generate the platform and embedded software for a mapping:
411
412 mapping.generate( muteks_generator, caba_generator, tlmt_generator )
413
414 # Posix generates code for a Tcg
415
416 soft.generate( posix_generator )
417
418 # TopMakefile takes the used generators:
419
420 TopMakefile( muteks_generator, caba_generator, tlmt_generator, posix_generator )
421 }}}
422
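
The MWMR channels mentioned in the general principles are described as lock-protected software FIFOs shared by several writers and several readers. The sketch below illustrates that behaviour in plain Python threads, in the spirit of the POSIX workstation target. It is only an assumption-laden illustration: `MwmrChannel`, `run_demo` and their parameters are hypothetical names, not part of DSX, DSX/L or the SRL API, and the real channels are used from C tasks through the `srl_mwmr_*` functions.

```python
import threading
from collections import deque


class MwmrChannel:
    """Hypothetical bounded FIFO usable by any number of writers and readers."""

    def __init__(self, depth):
        self._items = deque()
        self._depth = depth
        # A single lock protects the FIFO; two conditions share it.
        self._lock = threading.Lock()
        self._not_full = threading.Condition(self._lock)
        self._not_empty = threading.Condition(self._lock)

    def write(self, item):
        """Block while the FIFO is full, then append one item."""
        with self._not_full:
            while len(self._items) >= self._depth:
                self._not_full.wait()
            self._items.append(item)
            self._not_empty.notify()

    def read(self):
        """Block while the FIFO is empty, then remove the oldest item."""
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item


def run_demo(n_writers=2, n_readers=2, items_per_writer=50):
    """Run writers and readers on one channel and return everything read.

    Assumes n_writers * items_per_writer is divisible by n_readers.
    """
    channel = MwmrChannel(depth=4)
    received = []
    received_lock = threading.Lock()
    per_reader = (n_writers * items_per_writer) // n_readers

    def writer(writer_id):
        for k in range(items_per_writer):
            channel.write((writer_id, k))

    def reader():
        for _ in range(per_reader):
            item = channel.read()
            with received_lock:
                received.append(item)

    threads = [threading.Thread(target=writer, args=(w,)) for w in range(n_writers)]
    threads += [threading.Thread(target=reader) for _ in range(n_readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    print(len(run_demo()), "items transferred")
```

Synchronization here comes only from data availability in the channel: writers block when the FIFO is full and readers block when it is empty, so no extra barrier or lock is needed between the tasks themselves.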