<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>TechGuri &#187; High Level Synthesis</title>
	<atom:link href="http://www.techguri.com/category/high-level-synthesis/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.techguri.com</link>
	<description>Technical blog EDA, semiconductor industry</description>
	<lastBuildDate>Thu, 29 Jul 2010 15:03:53 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Unlock the Low-Power Design Puzzle with Algorithmic Synthesis</title>
		<link>http://www.techguri.com/2009/09/14/unlock-the-low-power-design-puzzle-with-algorithmic-synthesis/</link>
		<comments>http://www.techguri.com/2009/09/14/unlock-the-low-power-design-puzzle-with-algorithmic-synthesis/#comments</comments>
		<pubDate>Tue, 15 Sep 2009 00:48:58 +0000</pubDate>
		<dc:creator>Fernando Martinez</dc:creator>
				<category><![CDATA[High Level Synthesis]]></category>
		<category><![CDATA[Low Power]]></category>
		<category><![CDATA[Algorithmic Synthesis]]></category>
		<category><![CDATA[HLS]]></category>

		<guid isPermaLink="false">http://www.techguri.com/?p=554</guid>
		<description><![CDATA[Low-power design techniques have been around for quite a while and have until recently been looked upon as a set of nice optimizations to have, not a tape-out requirement in many application domains. The rapid growth of the consumer market for handheld devices and growing awareness about environmental impact from power usage have changed this. [...]]]></description>
			<content:encoded><![CDATA[<p>Low-power design techniques have been around for quite a while and have until recently been looked upon as a set of nice optimizations to have, not a tape-out requirement in many application domains. The rapid growth of the consumer market for handheld devices and growing awareness about environmental impact from power usage have changed this. Today, low-power design is not a feature; it is a requirement for gaining and keeping market share. The problem now is how to use all the techniques in the low-power design puzzle to minimize power consumption and still meet a design schedule.</p>
<p>Before looking at how Algorithmic Synthesis accelerates low-power design, let&#8217;s take a look at the spectrum of alternatives to achieve low power. Analysis and optimization of a design&#8217;s power footprint can start as early as the system-level specification or as late as the physical layout. While design changes to reduce power can occur at any stage of the design cycle, the effort required and the power reduction achievable are inversely related: the later the change, the more effort it takes and the less power it saves. The figure below helps illustrate this point.</p>
<p><img class="aligncenter size-large wp-image-558" src="http://www.techguri.com/wp-content/uploads/2009/09/power-1023x453.png" alt="Power Optimization Opportunities During the Design Cycle" width="741" height="328" /></p>
<p style="text-align: center;">Figure 1: Power Reduction Opportunities Across Stages of a Design Cycle</p>
<p style="text-align: left;">Figure 1 shows that the closer a design is to the gate level, the harder it is to make changes that will reduce power consumption. At the same time, the maximum possible reduction in power consumption decreases the closer a design is to the final gate-level implementation. Several reasons explain this phenomenon. Primarily, the inverse relationship between effort and power savings can be summarized by the following constraints at the RTL level:</p>
<ol>
<li>Once the RTL is functionally complete and in verification, touching the RTL for power savings is a complex and risky approach which can delay tape-out. Once a design is in verification, the most common type of RTL change has to do with functional correctness, not with design improvements.</li>
<li>Opportunities to change the power profile of an algorithm once the RTL is complete are very limited. Maybe a logic equation can be further optimized here or there, but it is too late to make architectural changes with real impact on the power profile. For example, changing the memory representation of data can have a large impact on power. The problem is that this change usually involves an algorithm change, which will not be taken up at the RTL level.</li>
<li>Once the RTL is complete, it is typically too late in the design cycle to consider advanced design techniques such as clock gating. Without clock gating, a design can easily consume 50% more power than is needed for functional correctness.</li>
</ol>
<p>When looking at Figure 1 and the constraints at the RTL level, it is clear that the solution for low power has to be applied earlier in the design cycle. The engineer has to move to a higher level of abstraction to gain the freedom in the design cycle to test how algorithmic variations affect the power profile. One way of increasing the level of abstraction is to move the design capture from RTL to C by using Algorithmic Synthesis (AS) tools.</p>
<p>AS refers to a class of hardware design tools that raise the level of abstraction for design capture from RTL to a programming language such as C. One of the key advantages of AS tools is that efficient hardware implementations are derived from untimed, sequential C algorithms. This allows the designer to focus on the algorithm while being shielded from the error-prone steps involved in writing and verifying RTL. While all AS tools offer an increased level of design abstraction when compared to RTL, they do not all provide the same level of capabilities to enable power reduction and optimization of an algorithm. From the field of AS tools, PICO Extreme Power from Synfora is the first to automatically optimize power consumption at both the system and the architecture level by using a variety of techniques such as multi-level clock gate insertion. As shown by Figure 1, there are clear benefits to tackling power consumption at the system and architecture level instead of the transistor and layout levels.</p>
<p>Another conclusion which can be drawn from Figure 1 is that the higher the abstraction level for design capture, the faster it is to test and verify different power-saving strategies. With this in mind, the question is: how can a technique such as multi-level clock gating be used efficiently on a design captured in an AS tool?</p>
<p>The answer to this question requires the explanation of a more basic concept. What is clock gating and what are its benefits? The basic premise of clock gating is that portions of a computational datapath can be turned on and off depending on dynamic processing requirements by shutting off sections of the clock tree network. While the concept is simple, its implementation is actually quite complex. Effective use of clock gating requires:</p>
<ul>
<li>Fine-grained knowledge of the schedule of datapath sections and blocks relative to other elements in the design. One common mistake with clock gating is to turn off a block or datapath section without taking into account the downstream effects of that decision, which leads to deadlocks.</li>
<li>Increased verification effort and complexity to cover all the cases when a block may be inactive and turned off. The verification team also has to take into account the cases where the block is turned on again. Both the shutdown and startup of a clock gated element must be tested to occur only in a safe state of the circuit operation.</li>
</ul>
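<p>To make the basic premise concrete, here is a minimal behavioral sketch in C of a single register with and without clock gating. It is only an illustration: the names <code>reg_model</code>, <code>step_ungated</code> and <code>step_gated</code> are hypothetical, the model counts clock edges as a rough stand-in for switching activity, and it does not represent the output or the internals of any AS tool.</p>

```c
/* Illustrative model only: all names here are assumptions made for
 * this sketch, not part of any synthesis tool. */
typedef struct {
    unsigned value;        /* current register contents */
    unsigned clock_edges;  /* clock edges that actually reached the register */
} reg_model;

/* Ungated register: the clock reaches it on every cycle, even when
 * there is nothing new to load. */
reg_model step_ungated(reg_model r, int enable, unsigned data)
{
    r.clock_edges += 1;
    if (enable) {
        r.value = data;
    }
    return r;
}

/* Gated register: the clock is suppressed whenever enable is low, so
 * idle cycles deliver no clock edge at all. */
reg_model step_gated(reg_model r, int enable, unsigned data)
{
    if (enable) {
        r.clock_edges += 1;
        r.value = data;
    }
    return r;
}
```

<p>Driving both functions with the same enable and data sequence leaves the same value in the register, but the gated version only records a clock edge on the cycles where enable was high. The hard part described in the list above is not this local behavior; it is deciding when enable may safely be low without stalling the blocks downstream.</p>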
<p>While clock gating has the potential to deliver significant power reduction in a given design, the complexity associated with verifying this technique prohibits many traditional hand-written RTL flows from utilizing it. An AS tool like PICO Extreme Power solves the problems associated with designing clock-gated hardware through automation. In the case of the PICO solution, the tool is in complete control of the RTL being generated. This means that PICO has complete knowledge of block inactive/active states, and of cross-block dependencies which affect the clock gating implementation. Without affecting how the user creates the design in the AS tool, automatic clock gating insertion happens at the following levels:</p>
<ul>
<li><strong>Coarse-grain</strong>: Automatic startup and shutdown of large portions of a design from the top-level module. At this level, the AS tool has to guarantee both functional correctness and the correctness of the control logic associated with clock gating. The correctness of the clock gating has to be verified both statically and through simulation to provide the user with confidence in the correctness of the solution.</li>
<li><strong>Fine-grain</strong>: Even if an entire block cannot be turned off, portions of that block can be. The AS tool should detect this possibility and create both the appropriate control logic and the verification infrastructure to prove correct operation. One way of enabling fine-grain clock gating is through the use of multi-level hierarchical design using a TCAB design methodology. TCABs will be discussed in more detail in a follow-up posting.</li>
</ul>
<p>In addition to inserting the clock gating circuits at different levels of the design hierarchy, the AS tool needs to verify the correct sequencing of all clock and clock enable signals. Without a verification component as part of any automated clock gating solution, the power savings achieved by this technique will be overshadowed by the manual effort in verifying the correctness of the circuit. As in a traditional hand-designed RTL flow, clock gating is a powerful technique, but it will not be used if the verification burden is high.</p>
<p>Unlocking the low-power design puzzle requires a combination of techniques, which can be readily applied at the C algorithmic level. In addition to the classical approaches in AS tools such as architectural exploration and algorithmic changes, clock gating is an important tool in minimizing power consumption.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techguri.com/2009/09/14/unlock-the-low-power-design-puzzle-with-algorithmic-synthesis/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title></title>
		<link>http://www.techguri.com/2009/08/28/529/</link>
		<comments>http://www.techguri.com/2009/08/28/529/#comments</comments>
		<pubDate>Fri, 28 Aug 2009 12:02:48 +0000</pubDate>
		<dc:creator>Vinod Kathail</dc:creator>
				<category><![CDATA[High Level Synthesis]]></category>

		<guid isPermaLink="false">http://www.techguri.com/?p=529</guid>
		<description><![CDATA[By now you might have seen our announcement on acquiring the Esterel Studio tool suite. This technology is complementary to our leading PICO algorithmic synthesis platform and was already part of an integrated flow used by several of our customers. Talk to the users of Esterel Studio and they will tell you that Esterel is [...]]]></description>
			<content:encoded><![CDATA[<p>By now you might have seen our announcement on acquiring the Esterel Studio tool suite. <a href="http://www.synfora.com/news/press/082709.html">This technology is complementary to our leading PICO algorithmic synthesis platform and was already part of an integrated flow used by several of our customers.</a></p>
<p>Talk to the users of Esterel Studio and they will tell you that Esterel is a great language and Esterel Studio is a great product for delivering complex control IP that has been formally verified. Esterel solved a hard problem with an elegant product. However, this was not enough as the design community moved towards C/C++ as a way of designing complex accelerators. Designers want the capability to build systems that encompass both application accelerator and control IP within a single integrated environment.<br />
Technically, there are several key technologies that can be migrated into a C-based flow, extending its capabilities to design efficient hardware for parallel, hierarchical state machines that need to react rapidly to real-time inputs. This will leverage Esterel’s powerful compilation and verification technology and close an open issue when building complex systems in hardware.</p>
<p>We have absorbed the technology and ensured that the users have a stable and maintained product while we work out how we can use this exciting technology to further extend the reach of C synthesis into larger and more complex systems.</p>
<p>We have long believed in the power of the Esterel Studio technology and we can now see a way to provide it to a larger audience.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techguri.com/2009/08/28/529/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to select and deploy the ESL solution that is right for you?</title>
		<link>http://www.techguri.com/2009/05/27/how-to-select-and-deploy-the-esl-solution-that-is-right-for-you/</link>
		<comments>http://www.techguri.com/2009/05/27/how-to-select-and-deploy-the-esl-solution-that-is-right-for-you/#comments</comments>
		<pubDate>Thu, 28 May 2009 00:53:18 +0000</pubDate>
		<dc:creator>Fernando Martinez</dc:creator>
				<category><![CDATA[High Level Synthesis]]></category>
		<category><![CDATA[Algorithmic Synthesis]]></category>
		<category><![CDATA[esl]]></category>
		<category><![CDATA[QoR]]></category>

		<guid isPermaLink="false">http://www.techguri.com/?p=300</guid>
		<description><![CDATA[ESL tools have for a long time had a bad reputation as promising the moon and delivering next to nothing. While there are still tools out there which don&#8217;t deliver value to the designer, I&#8217;m happy to say ESL tools have evolved and this segment of the EDA space is maturing. Tools like our own [...]]]></description>
			<content:encoded><![CDATA[<p>ESL tools have for a long time had a bad reputation as promising the moon and delivering next to nothing. While there are still tools out there which don&#8217;t deliver value to the designer, I&#8217;m happy to say ESL tools have evolved and this segment of the EDA space is maturing. Tools like our own PICO Extreme development environment are being used in both the ASIC and FPGA domains to create real designs going into products shipping today.</p>
<p>The companies widely deploying this type of design methodology have made the mental leap to move to a design abstraction higher than RTL for better productivity and faster design turnaround time. Others are being pushed by increasing design complexity, the need for cost reductions, and design cycle cuts to find a better way to create their IP. Seeing that the shift from RTL to a higher level of abstraction is gaining momentum, I got to thinking: how does a designer considering an ESL solution separate the wheat from the chaff and select the best one?</p>
<p>One obvious criterion is cost, but more important is what the solution delivers. What is the design entry language? How is quality of results (QoR) achieved? How is performance met? Can the target performance be met? How is the design verified? An FPGA designer might also ask: my design is on an expensive, large prototyping part; how do I migrate to a low-cost, mass-volume FPGA? An ASIC designer might also ask: what about an ECO?</p>
<p>Before diving into the detailed answer for each of these questions, one question must be answered first:</p>
<p>How to select and deploy an ESL tool successfully both for an initial project and across the entire design team?</p>
<p>The exact answer depends on the type of designs created and the dynamics of the team. Having said that, here are some guidelines of things to keep in mind when selecting an ESL solution:</p>
<p>1. Use the highest level of abstraction possible</p>
<p>RTL coding is tedious and time consuming, but familiar. Many tools out there offer control and coding that is very similar to RTL. The user is given complete freedom to specify everything happening on a clock cycle: parallelism, microarchitecture, etc. The only advantage is that you don&#8217;t have to know VHDL/Verilog to generate RTL, but you are still creating RTL by hand. One thing to think about: if writing RTL is not keeping up with your needs, why would writing RTL in another language be any different? The designer should strive to use a methodology that keeps the design code as close to the algorithm description as possible. The fewer changes you have to make from algorithmic code to the hardware implementation, the easier it is to share and reuse IP.</p>
<p>2. Focus on QoR equal to hand design</p>
<p>Area in an ASIC is expensive. Changing from one FPGA to a larger one can also turn out to be very costly. No matter how good the tool looks or how easy it is to use, if the quality of results is not on par with hand-written RTL, your management will not allow you to use the tool on a production-quality design.</p>
<p>3. Don&#8217;t debug in the RTL</p>
<p>Raising the level of design abstraction needs to cover verification as well, not just design entry. Verification tends to dominate the design cycle. Debugging at the RTL level is slow, error prone, tedious, and difficult. If the ESL solution you are considering doesn&#8217;t raise verification out of the RTL level, any productivity gain in design entry is meaningless. Look for a tool that allows you to debug at the same level as your design entry. For example, a C-based tool should allow you to do all debugging at the C level. The RTL simulation should be a final validation that everything is fine with the design. It should not be the place where all issues in the design are discovered.</p>
<p>4. Don&#8217;t benchmark or make flow decisions on a small block</p>
<p>Testing an ESL tool on something simple such as an FIR filter is fine for getting an initial feel for the software. Making a decision on a tool choice or flow changes requires a realistic example. Be prepared to invest the time in testing the tool with a design representative of what you are building. Testing the approach with a realistic test case will save headaches later. The last thing you want to do is propose that your company acquire a tool that does not solve any of your problems.</p>
<p>5. Take advantage of high level abstraction for optimization</p>
<p>One of the big advantages of ESL tools is the quick turnaround from algorithm change to new hardware implementation. Take advantage of this speed to explore new architectures, explore power reduction at the algorithm level, and explore different algorithms to solve the same problem. Look for built-in analysis utilities which give you complete information about every aspect of your design. What&#8217;s the dataflow? Where is the performance bottleneck coming from? Where is the area going? These are all basic questions your ESL tool should quickly answer.</p>
<p>6. Build a hierarchical methodology that promotes IP sharing and reuse</p>
<p>Think about how you want to reuse your algorithmic IP. From an algorithmic point of view, the same codec can go in a set-top box or in a cell phone, but the hardware and performance requirements for each case are completely different. Look for solutions which allow you to freely use your algorithms across different projects with little or no modification. Changes in terms of clock frequency and throughput should be an input to the tool and separate from the algorithm code. Look for the capability of pulling in as much of the design as possible into the ESL solution. Chopping up an algorithm into small blocks, building them in a tool and manually stitching them together doesn&#8217;t work. Also, think about the capacity constraints of a tool and the size of designs you are building. How large a design can you build with a tool and how large a design do you need to build? Are you looking at a few thousand gates, a few hundred thousand gates or beyond a million gates?</p>
<p>7. Assume architectural awareness in the code, but don&#8217;t code in details</p>
<p>Expecting any tool to create optimized hardware from random code is unrealistic. You should expect to embed architectural awareness into your code. Architectural awareness means that the code has a notion of how the memory is organized and how inter-block communication happens. Whether blocks communicate through streams, shared memories or scalars is a natural architectural decision which should only be made by a designer.</p>
<p>Things like explicitly stating parallelism and having to control the level of pipelining should not be required of the user. Forcing the user to deal with these things lowers the level of abstraction and productivity. If the tool you are considering requires you to state parallelism, pipelining and other fine-grained details, it is not the right tool to solve your RTL design issues.</p>
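<p>As a rough sketch of what architectural awareness without coded-in details might look like, the C fragment below expresses a small filter as a streaming block: one sample in, one result out, with a local window memory whose size the designer chose. The names <code>fir_state</code>, <code>fir_push</code> and <code>TAPS</code> are hypothetical and not tied to any particular ESL tool. Note what the code does not contain: no clocks, no pipeline stages and no statement of parallelism.</p>

```c
/* Hypothetical streaming block: names and sizes are assumptions
 * made for illustration only. */
enum { TAPS = 4 };

typedef struct {
    int window[TAPS];  /* local sample memory, an explicit architectural choice */
} fir_state;

/* Consume one sample from the input stream and return one output sample.
 * The loops are plain sequential C; how they are scheduled in hardware
 * is deliberately left unstated. */
int fir_push(fir_state *s, const int coeff[TAPS], int sample)
{
    int acc = 0;
    int i;

    /* Shift the window by one position, oldest sample falls off the end. */
    for (i = TAPS - 1; i != 0; i--) {
        s->window[i] = s->window[i - 1];
    }
    s->window[0] = sample;

    /* Multiply-accumulate over the current window. */
    for (i = 0; i != TAPS; i++) {
        acc += coeff[i] * s->window[i];
    }
    return acc;
}
```

<p>The decision to communicate by stream and to keep a four-entry local window belongs to the designer and is visible in the code. Decisions such as unrolling the loops or pipelining the accumulation are the kind of detail the text above argues should be left to the tool.</p>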
<p>8. Build an end-to-end flow: algorithm -&gt; physical implementation</p>
<p>Getting an algorithm from C to RTL is not enough. You need to pass verification, synthesize and close timing on the ASIC library or FPGA part with a netlist that is area optimized. Look for elevating the verification of the IP up to the level of design entry. If you are designing in C, it only makes sense to verify and debug in C and not have to wait until RTL. Tools which only allow you to debug and verify at the RTL level shouldn&#8217;t even be considered. Also, look for integration into industry-standard back-end flows for place and route. Consider vendors who can help you create an end-to-end solution; being left to figure out everything on your own just doesn&#8217;t work.</p>
<p>9. Develop internal expertise</p>
<p>Moving design abstraction from RTL to something like C is a major methodology change for any design team. Be prepared to invest the time in developing in-house expertise on the new methodology. A phased approach, which leverages local experts, is the safest bet for a successful deployment. Trying to move all designers at once will just create tension and discontent and set up your ESL adoption for failure.</p>
<p>10. Plan for what you will need 2 to 5 years down the line</p>
<p>Look at your product roadmap: what kind of designs are you expecting in terms of complexity and size? Look at the roadmap of your ESL vendor: does their plan track with what you expect to need? Keep in mind that once you switch from hand-written RTL to a higher level of abstraction, the methodology change is here to stay. You won&#8217;t be trying ESL and then retooling to go back to hand-written RTL. You will need to make sure that your preferred ESL vendor not only helps with your immediate problems, but also has a plan for supporting your needs in the future. Changing vendors is much easier than changing design methodology&#8230; but if by doing a little bit of research you can find the perfect fit the first time around, why wouldn&#8217;t you?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techguri.com/2009/05/27/how-to-select-and-deploy-the-esl-solution-that-is-right-for-you/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>High Level Synthesis</title>
		<link>http://www.techguri.com/2009/05/07/high-level-synthesis/</link>
		<comments>http://www.techguri.com/2009/05/07/high-level-synthesis/#comments</comments>
		<pubDate>Thu, 07 May 2009 17:17:34 +0000</pubDate>
		<dc:creator>TechGuri Administration</dc:creator>
				<category><![CDATA[High Level Synthesis]]></category>

		<guid isPermaLink="false">http://www.techguri.com/?p=209</guid>
		<description><![CDATA[Find out more about High Level Synthesis right here...]]></description>
			<content:encoded><![CDATA[<p>Find out more about High Level Synthesis right here&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techguri.com/2009/05/07/high-level-synthesis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
